Machine learning and artificial intelligence (AI) are powerful tools that create predictive models, extract information, and help make complex decisions. They do this by examining an enormous quantity of labeled training data to find patterns too complex for human observation. However, in many real-world applications, well-labeled data can be difficult, expensive, or even impossible to obtain. In some cases, such as when identifying rare objects like new archeological sites or secret enemy military facilities in satellite images, acquiring labels could require months of work by trained human observers at incredible expense. Other times, as when attempting to predict disease infection during a pandemic such as COVID-19, reliable true labels may be nearly impossible to obtain early on due to lack of testing equipment or other factors. In that scenario, identifying even a small amount of truly negative data may be impossible due to the high false negative rate of available tests. In such problems, it is possible to label a small subset of data as belonging to the class of interest, though it is impractical to manually label all data not of interest. We are left with a small set of positive labeled data and a large set of unknown and unlabeled data. Readers will explore this Positive and Unlabeled learning (PU learning) problem in depth. The book rigorously defines the PU learning problem, discusses several assumptions commonly made about it and their implications, and considers how to evaluate solutions before describing several of the most popular algorithms for solving it. It explores several uses for PU learning, including applications in the biological/medical, business, security, and signal processing domains. This book also provides high-level summaries of several related learning problems, such as one-class classification, anomaly detection, and noisy learning, and their relation to PU learning.
Preface.- Acknowledgments.- Introduction.- Problem Definition.- Evaluating the Positive Unlabeled Learning Problem.- Solving the PU Learning Problem.- Applications.- Summary.- Bibliography.- Authors' Biographies.
Kristen Jaskie received her Ph.D. in Signal Processing and Machine Learning through the Electrical Engineering department and the SenSIP center at Arizona State University in Tempe, Arizona in 2021 and her B.S. and M.S. degrees in Computer Science with an emphasis in Machine Learning from the University of Washington in Seattle, Washington and the University of California San Diego in San Diego, California, respectively. Kristen is the principal ML research scientist at Prime Solutions Group and a postdoctoral researcher at ASU working with Dr. Spanias in the SenSIP center. Kristen's research interests include machine learning and deep learning algorithm development and application, with a focus on semi-supervised learning and the positive and unlabeled learning problem. She is the author of multiple papers including "Positive and Unlabeled Learning Algorithms and Applications: a Survey," "A Modified Logistic Regression for Positive and Unlabeled Learning," and "PV Fault Detection Using Positive Unlabeled Learning."
Andreas Spanias is a Professor in the School of Electrical, Computer, and Energy Engineering at Arizona State University (ASU). He is also the director of the Sensor Signal and Information Processing (SenSIP) center and the founder of the SenSIP industry consortium (also an NSF I/UCRC site). His research interests are in the areas of adaptive signal processing, speech processing, machine learning, and sensor systems. He and his student team developed the computer simulation software Java-DSP and its award-winning iPhone/iPad and Android versions. He is the author of two textbooks: Audio Processing and Coding by Wiley and DSP-An Interactive Approach (2nd Ed.). He contributed to more than 300 papers, 11 monographs, 13 full patents, 6 provisional patents, and 10 patent pre-disclosures. He served as Associate Editor of the IEEE Transactions on Signal Processing and as General Co-chair of IEEE ICASSP-99. He also served as the IEEE Signal Processing Vice-President for Conferences. Andreas Spanias is co-recipient of the 2002 IEEE Donald G. Fink paper prize award and was elected Fellow of the IEEE in 2003. He served as Distinguished Lecturer for the IEEE Signal Processing Society in 2004. He is a series editor for the Morgan and Claypool lecture series on algorithms and software. He recently received the 2018 IEEE Phoenix Chapter award with citation: "For significant innovations and patents in signal processing for sensor systems." He also received the 2018 IEEE Region 6 Educator Award (across 12 states) with citation: "For outstanding research and education contributions in signal processing." He was elected recently as Senior Member of the National Academy of Inventors (NAI).