
Automatic feed phase identification in multivariate bioprocess profiles by sequential binary classification

In this paper, we propose a new strategy for the retrospective identification of feed phases from feed profiles, enriched with online sensor data, of an Escherichia coli (E. coli) fed-batch fermentation process. In contrast to conventional (static), data-driven multi-class machine learning (ML), we exploit process knowledge to constrain our classification system, which yields more parsimonious models than static ML approaches. In particular, we enforce unidirectionality on a set of binary, multivariate classifiers, each trained to discriminate between two adjacent feed phases, by linking the classifiers through a one-way switch. The switch is activated when the output of the currently active classifier changes. The next binary classifier in the chain then takes over and discriminates between the next pair of feed phases, and so on. To prevent premature activation, we allow the switch to fire only after a predefined number of consecutive predictions of a transition event, and we carry out a sensitivity analysis on the choice of this (time) lag parameter. From a complexity/parsimony perspective, the benefit of our approach is three-fold: i) the multi-class learning task is broken down into binary subproblems, which usually have simpler decision surfaces and tend to be less susceptible to the class-imbalance problem; ii) we exploit the fact that the process follows a rigid feed cycle structure (i.e. batch-feed-batch-feed), which allows us to focus on the subproblems involving phase transitions as they occur during the process while discarding off-transition classifiers; and iii) only one binary classifier is active at a time, which keeps the effective model complexity low. We further use a combination of logistic regression and Lasso (i.e. regularized logistic regression, RLR) as a wrapper to extract the most relevant features for the individual subproblems from the full set of high-dimensional sensor data.
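The one-way switch with a lag parameter described above can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `identify_phases`, the representation of each binary classifier as a callable returning `True` when it predicts that the next phase has started, and the toy threshold classifiers in the usage note are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

Sample = Sequence[float]
# Hypothetical interface: a classifier returns True when it predicts
# that the process has already moved on to the next feed phase.
BinaryClassifier = Callable[[Sample], bool]


def identify_phases(
    samples: Sequence[Sample],
    classifiers: Sequence[BinaryClassifier],
    lag: int,
) -> List[int]:
    """Label each sample with a phase index using a chain of binary classifiers.

    Only classifiers[phase] is consulted at any time. The switch to the next
    classifier fires after `lag` consecutive transition predictions and can
    never be undone (one-way switch).
    """
    if lag < 1:
        raise ValueError("lag must be at least 1")

    phase = 0   # index of the current phase and of the active classifier
    streak = 0  # consecutive transition predictions seen so far
    labels: List[int] = []

    for x in samples:
        chain_exhausted = phase >= len(classifiers)
        if not chain_exhausted and classifiers[phase](x):
            streak += 1
            if streak >= lag:
                phase += 1  # activate the next classifier in the chain
                streak = 0
        else:
            streak = 0      # an interrupted run does not count
        labels.append(phase)

    return labels
```

As a usage example with made-up data, take a single feed-rate feature and two threshold classifiers (batch to feed, then feed to batch). With `lag=2` the isolated spike at the third sample is ignored, whereas with `lag=1` it triggers the switch prematurely:

```python
samples = [[0.0], [0.0], [0.9], [0.0], [0.8], [0.9], [0.7], [0.1], [0.0], [0.1]]
chain = [lambda x: x[0] > 0.5, lambda x: x[0] < 0.5]

identify_phases(samples, chain, lag=2)  # [0, 0, 0, 0, 0, 1, 1, 1, 2, 2]
identify_phases(samples, chain, lag=1)  # [0, 0, 1, 2, 2, 2, 2, 2, 2, 2]
```

In this sketch a detected transition is labelled from the sample at which the switch fires, so a larger lag also delays the labelled phase boundary by `lag - 1` samples.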
We train different soft computing classifiers, including decision trees (DT), k-nearest neighbors (k-NN), support vector machines (SVM) and a fuzzy classifier developed in-house, and compare our method with conventional multi-class ML. Our results show that the proposed method clearly outperforms static ML approaches in terms of accuracy and robustness. We achieved close to error-free feed phase classification, reducing the misclassification rates in 17 of the 20 investigated test cases by between 39% and 98.2%, depending on feature set and classifier architecture. Models trained on features selected by RLR significantly outperformed those trained on features suggested by experts, and their predictive performance was considerably less affected by the choice of the lag parameter.

Publisher URL: www.sciencedirect.com/science

PII: S0003267017307109
