Current Applications of AI in Sleep Medicine: AI Beyond the PSG Recording
Video Transcription
Welcome to today's educational event, Current Applications of AI in Sleep Medicine: AI Beyond the PSG, sponsored by the American Academy of Sleep Medicine and co-sponsored by the Artificial Intelligence in Sleep Medicine Committee. Today I will be joined by two members of the AI committee as we pose a series of questions about AI tools that are currently used across sleep medicine to our esteemed panel. Today's webinar is a follow-up to our previous presentation on the fundamentals of artificial intelligence. Today we will be addressing the knowledge gaps surrounding artificial intelligence as it pertains to the practical applications and specific tools that are becoming available to clinicians in the field of sleep medicine. Before I introduce our panelists, I'd like to remind you that your audio is muted and your video is off. We encourage you to submit any questions on the topic for the speakers in the Q&A section of the Zoom platform. Now please allow me to introduce our panelists for today. My name is Dr. Felicia Jefferson. We have Dr. Anuja Bandyopadhyay, who is the chair of the AI in Sleep Medicine Committee, and Dr. Haoqi Sun, who is also a member of this committee, as am I. The speakers for today include Dr. Mignot. He is the Craig Reynolds Professor of Sleep Medicine at Stanford University. Dr. Mignot discovered that human narcolepsy is caused by the autoimmune loss of approximately 70,000 hypothalamic neurons producing the wake-promoting peptide hypocretin (orexin). He identified HLA-DQB1*06:02 and T-cell receptor genes as major susceptibility genes across ethnic groups, which act together to promote a highly selective T-cell-mediated autoimmune process triggered by influenza infection. Dr. Mignot has received numerous awards, including the 2023 Breakthrough Prize. Joining us today is also Dr. Westover. He is a professor in the Department of Neurology at Beth Israel Deaconess Medical Center and Director of Data Science at the McCance Center for Brain Health at Mass General Hospital. Dr. Westover obtained his PhD in physics, working in the field of information theory. He is a board-certified practicing neurologist and clinical neurophysiologist at Beth Israel Deaconess Medical Center and previously at Mass General Hospital, where he directed the MGH Critical Care EEG Monitoring Service. His group develops machine learning and AI approaches to improve medical care for patients with neurological conditions. Mr. Tan Ngo is the President and CEO of MaskFit AR and Chairman of the BC Respiratory and Sleep Providers Association. Mr. Ngo is a registered respiratory therapist who earned his Bachelor of Science degree in cardiopulmonary sciences from Northeastern University. He was trained at the School of Sleep Medicine, a program affiliated with the Stanford University Center of Excellence for the Diagnosis and Treatment of Sleep Disorders. He holds patents in the U.S. and Canada for facial feature detection and measurement that integrate AI and machine learning for the treatment of obstructive sleep apnea. We will begin with Dr. Mignot, who will give us a short introduction to hypnodensity, and we will go directly into questions from there. Dr. Mignot? Hello. Can everyone see me or hear me? Yes. Okay, good.
So, it's a real pleasure to be here and to talk a little bit about one of the passions of my scientific career, which is trying to use artificial intelligence in the diagnosis of sleep disorders, especially pertaining to neurology. The two case studies that I want to discuss briefly with you are REM sleep behavior disorder and narcolepsy. Starting with narcolepsy, I'm sure you are all aware that deep learning has recently evolved in an exponential fashion as a method first applied to visual analysis for recognition of faces, or to speech recognition. It's a very simple technique that basically uses networks of multiple decisions that take, for example, a pixel of an image and apply filters with yes/no decisions. And then at the end, you train a network to decide if, for example, the face is my face or the face of someone else. Similarly, we can apply this to a feature of a sleep study, for example sleep stages, where we train a network to recognize the different stages of sleep. Little by little, the computer learns all the little filters in the signal that predict wake, stage 1, stage 2, stage 3. And then afterwards, we can relatively easily plot what we call a hypnodensity, which is basically a representation of the probability of each sleep stage at each epoch. You can think of it a little bit as if thousands of scorers had scored this particular hypnogram, and then you ask what percent of scorers would decide this is wake, REM, or non-REM. So, for example, on the top — I don't see the picture anymore, but on the top, do you still see it? Okay. On the top, you see a normal hypnogram, and you see that the machine learning program is giving a probability of wake. For the first two hours, you see that the person was awake, because there is almost 100% probability of being awake. Then there are these little peaks of red line, which suggest that maybe the patient went into some very brief periods of stage 1, maybe little sleep attacks. And then finally, he did fall asleep after two hours, and you see the sliver of stage 1 as a red line with an increased probability of stage 1. Then you see stage 2, which is light blue, and stage 3, which is dark blue. And then, like a normal hypnogram, you have this very nice first epoch of REM sleep, and then you restart the sleep cycle with stage 2, then stage 3 in dark blue, and then REM again after a brief arousal. So, you see what a hypnodensity does: it doesn't simply report the hypnogram, it reports the confidence you can have in the probability of each sleep stage, as if many, many scorers had agreed together and created a probability based on their own scoring. That gives a lot more information than just a hypnogram. And in fact, with this kind of program, which many people have now succeeded in building, that automatically scores sleep stages every 30 seconds or every 15 seconds — and now you can do it even every one second — we tested a very large number of subjects, and it worked very well when comparing sleep stages scored by humans against what the machine was predicting.
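To make the idea concrete, a hypnodensity is just the staging network's per-epoch class probabilities drawn as a stacked area chart. Here is a minimal sketch in Python; the random probabilities, the five-stage list, and the 30-second epochs are assumptions for demonstration, not the actual model output described above.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for a staging network's softmax output: one row per 30-second
# epoch, one column per stage, each row summing to 1.
rng = np.random.default_rng(0)
probs = rng.dirichlet([1.0, 1.0, 1.0, 1.0, 1.0], size=960)  # ~8 hours of epochs
stages = ["Wake", "N1", "N2", "N3", "REM"]

hours = np.arange(probs.shape[0]) * 30 / 3600   # epoch start times in hours
plt.stackplot(hours, probs.T, labels=stages)    # stacked probabilities = hypnodensity
plt.xlabel("Time (hours)")
plt.ylabel("Stage probability")
plt.legend(loc="upper right")
plt.show()
```

Taking the argmax of each row would collapse this back to an ordinary hypnogram; the stacked view is what preserves the "many scorers disagreeing" information.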
But then there was a weird pattern in some specific groups of patients, and the patient group that had the strangest pattern for us was narcoleptic patients. You see an example below, and it illustrates how this hypnodensity can give you more information than a simple hypnogram. What you see for the patient with narcolepsy at the bottom is that, first, he had a REM sleep period almost immediately after falling asleep. You see white, so he was awake, and then dark black, so he immediately went into REM sleep, which of course is very rare in normal people. I'm sure a lot of you know that a SOREMP at night happens in about 40% of patients with narcolepsy type 1, so it's always useful to look at nocturnal polysomnography to see if a SOREMP happens at night, because that can be predictive of type 1 narcolepsy. But then another phenomenon — yes, sorry? I have a few questions for you, because I know most of the attendees have read your paper, so I do want to ask these questions. The first is: how is AI being used to diagnose narcolepsy? You started out giving us some background on your studies, but specifically, how is AI being used to diagnose narcolepsy? Okay, so if you can just come back to the slide, that is exactly what I was saying. You see that there are two patterns. One is that the patient went straight into REM sleep, which you could see with a conventional polysomnography recording. But the other thing is this period of time between three and four hours where, bizarrely, the program cannot really distinguish between REM sleep, which is dark, or wake, or stage 1 and stage 2. It's all intermingled between three and four hours, and six to eight hours. So basically the program is unable to decide whether it's REM sleep, wake, or stage 1, and that's really what a patient with narcolepsy experiences. He may be paralyzed but awake, which normally does not happen, and that is probably why the computer has trouble knowing whether it's REM sleep or wake. So by doing this kind of hypnodensity based on normal sleep stage scoring, we found that there were specific features where the programs that automatically score sleep could not distinguish a normal sleep stage and instead found a sleep stage that was a mix of different sleep stages together. And using AI, we used a second program that extracted these features and then predicted narcolepsy. So if you go to the next slide: what we have developed is a program that uses the automatic sleep scoring and identifies, for example, whether the patient goes into REM sleep very quickly, or whether there are these odd sleep stages in between REM, wake, et cetera, that you don't find in normal people, and then gives a probability of having narcolepsy. And it works almost as well as the MSLT now. So we feel confident that a single night of PSG could work as well as an MSLT in the future. And I think it shows the power of AI. A lot of people think that AI only copies humans, but in fact, sometimes it allows you to discover something new, like these weird sleep stages. And we are also now applying it to REM sleep behavior disorder. Yes. And that was going to be my next question — you pretty much alluded to how it's being used in the diagnosis of REM behavior disorder. So thank you.
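A hedged sketch of the kind of second-stage feature extraction described here: reduce the hypnodensity to a nocturnal-SOREMP flag plus measures of stage dissociation, then feed those into a classifier. The thresholds, the 15-minute SOREMP window, and the feature names are illustrative assumptions, not the published model.

```python
import numpy as np

def hypnodensity_features(probs, epoch_sec=30):
    """probs: (n_epochs, 5) stage probabilities [Wake, N1, N2, N3, REM]."""
    stage = probs.argmax(axis=1)            # hard staging from the hypnodensity
    sleep = np.flatnonzero(stage != 0)
    rem = np.flatnonzero(stage == 4)
    # Nocturnal SOREMP: REM within 15 minutes of the first sleep epoch
    # (illustrative window).
    soremp = bool(len(sleep) and len(rem)
                  and (rem[0] - sleep[0]) * epoch_sec <= 15 * 60)
    # Dissociation: per-epoch entropy is high when the network cannot pick a
    # single stage, as in the mixed REM/wake/N1 stretches described above.
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    # Fraction of epochs where wake and REM are simultaneously probable.
    wake_rem_mix = float(np.mean((probs[:, 0] > 0.3) & (probs[:, 4] > 0.3)))
    return {"soremp": soremp,
            "mean_entropy": float(entropy.mean()),
            "wake_rem_mix": wake_rem_mix}
```

Per-night features of this kind would then go to a second classifier that outputs a probability of narcolepsy type 1.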
Yeah, I can answer the second question also, which is very important, because of course you all know that there is narcolepsy type 1 and narcolepsy type 2. We know pretty well now that only narcolepsy type 1 is a very homogeneous disorder that has a real biochemical cause, which is a lack of orexin. And clearly, this detector only works to detect narcolepsy type 1. In fact, if we test narcolepsy type 2 or idiopathic hypersomnia, in most cases the pattern is different. We don't see this dissociation. And it clearly suggests that narcolepsy type 2 is not as simple. We probably need to design a new program to diagnose narcolepsy type 2 and idiopathic hypersomnia. Nonetheless, we see a small portion — we are doing that now — maybe 10% of patients with narcolepsy type 2 who do have a pattern like narcolepsy type 1. So maybe we'll be able to differentiate the narcolepsy type 2 cases that are really due to a hypocretin abnormality from the others, but we're still working on it. So right now, the detector only works for narcolepsy type 1. For REM behavior disorder, we can use exactly the same principle, because of course REM behavior disorder is also a dissociation, where you have half being awake and half being in REM sleep. We've used a very similar approach, and we have been able to develop programs that have relatively high accuracy, about 90 to 95%, to detect RBD versus a clinical standard. But we are not as advanced; we need to do more. Whereas in narcolepsy we have done 900 narcoleptics versus thousands of controls, for RBD we have done only 100 RBD patients versus 100 controls. So we definitely need to do a lot more. And I think, in fact, that goes straight to the fourth question: why do we do this? That's really essential. One reason, of course, is to diagnose patients and treat them better, and to distinguish the narcolepsies that may be due to hypocretin deficiency from those that are not, because now we know that we have these orexin agonists coming on the market, and we know that patients with narcolepsy type 1 are more sensitive to these drugs. So it's going to be very important in the future to know which narcoleptics are likely to have low hypocretin, because they're going to react differently to medication. I think that makes it very important to have more effective methods for diagnosing narcolepsy. For RBD, of course, you all know that it evolves toward Parkinson's disease. So we want to detect it early, because everyone knows that for Alzheimer's or Parkinson's disease, we need to start before people have the disease. We need to stop it before; it will be too late once people have dementia. So RBD is very precious, because if we could find RBD easily, we could maybe prevent the development of Parkinson's. And that comes to the second part: we need to move away from these PSG methods, which can only work when people go to the lab, to methods that can be used in the large population, so that we can screen thousands or hundreds of thousands of people, if not millions, to know if they have narcolepsy or RBD. Because sometimes narcoleptics don't report it to the doctor, or have mild symptoms. And similarly for RBD, of course, we know that most cases are undiagnosed. And I think the future in that condition is really not to use only the PSG.
Of course, we can use the PSG, and everybody who goes in for a full PSG for sleep apnea or any other reason could be screened for narcolepsy or RBD at the same time they are being evaluated for sleep apnea. But maybe we need to also develop some new methods, which I am particularly fond of, that I would call multi-modal machine learning. The problem when you use only one technique like PSG is that there will always be false positives, because it's just one technique, and one technique has its limitations. But if you add genetics, or if you add one question, or if you record many more nights, you can become much more sensitive. So, for example, we have done a study with Emmanuel During where we did actigraphy in combination with a couple of questions about constipation and RBD, and we could obtain a very, very high accuracy. And the actigraphy was over seven nights, not just one night. So, I think the future is probably going to be to combine very convenient methods, either portable EEG or actigraphy, with a couple of questions or with genetics, and then we will be able to achieve a very high specificity, which is important for screening, and be able to screen in the general population. So, I think AI is definitely going to be more and more present, and it will be present in two ways. First, in the sleep lab, where you will have specialized programs that will help you automatically score sleep studies and identify specific disorders like RBD and narcolepsy, but also potentially give you an idea of the general health of the patient. And then sleep clinicians will probably move toward using more and more devices at home that record sleep or activity over a longer period of time, combined with a few questions or a very limited clinical evaluation, as screening methods. So, I think that's a little bit how I see it.
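A minimal sketch of the multi-modal idea just described: summarize each modality independently — multi-night actigraphy, a couple of questionnaire items, a genetic carrier flag — concatenate the features, and fit a single classifier. The data here are synthetic and every feature choice is an assumption for illustration, not the actual study with Emmanuel During.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200
acti = rng.normal(size=(n, 2))       # e.g., 7-night fragmentation + variability
quest = rng.integers(0, 2, (n, 2))   # e.g., dream enactment, constipation (0/1)
hla = rng.integers(0, 2, (n, 1))     # e.g., HLA-DQB1*06:02 carrier (0/1)
y = rng.integers(0, 2, n)            # synthetic case/control labels

X = np.hstack([acti, quest, hla])    # feature-level fusion of the modalities
clf = LogisticRegression().fit(X, y)
risk = clf.predict_proba(X)[:, 1]    # per-subject screening probability
```

The point of the fusion is exactly the one made above: because the modalities fail independently, their combination can reach a specificity that no single modality achieves, which is what population screening requires.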
Dr. Mignot, do you think it will be some time before we're able to fully put this into practice — or at least your AI technology for narcolepsy, do you think that's ready to be applied? Yes. So, I'm already working with several companies. I'm giving all my programs away for free — I just want to make sure you know that; we're publishing them all online — but we're working with EnsoData, for example, and they are going to try to incorporate this machine learning program for narcolepsy. We're working with SomnoMedics, and they're also trying to incorporate it into their own program, and certainly it's freely available to everyone. So, I predict that, yes, everyone will be using this kind of program in the future. And of course, as you know, it's very important for the AASM to make sure that when it's used, it's working, because people are going to use it, and that's why I very much support the idea that there needs to be a body that checks that the quality is good. Thank you so much, Dr. Mignot, for the information regarding your AI technology and narcolepsy. We'll have a panel discussion at the end of this workshop to discuss more. I do believe that some questions have come in from the audience. Matt, do you have any of those to ask Dr. Mignot? Yes, we actually have a variety of interests from attendees on different aspects of AI and narcolepsy. I'll kick it off with two on feature recognition. The first is: how accurate is the AI, compared to manual PSG scoring — I'm assuming that's what is meant — at distinguishing wake from REM, and what characteristics does AI use to identify REM sleep and the other stages of sleep? So, I will answer very quickly, because that's relatively easy. Every time you look at a sleep scoring algorithm, you should always look at what we call a confusion table. You have a table with each sleep stage — stage 1, 2, 3, REM, and wake — with what is scored manually on one axis and what is predicted on the other, because the errors are not all equal. So, that's a very, very good point: does the program find it difficult to distinguish wake from REM? Because a program could be good overall but really biased, because most epochs are stage 2. So maybe it would function very well at distinguishing stage 2 from everything else, but be horrible at distinguishing wake from REM. Actually, the programs are very effective at distinguishing wake from REM. I can't remember the exact number, but it's at least 90 or 93 percent. Where they have the most trouble is distinguishing REM from stage 1, and stage 1 from wake. Clearly, this is the weakest point, but a lot of people believe that stage 1 is a little bit of a bastard stage — we really don't know what stage 1 is. But interestingly, narcoleptics have a lot more stage 1, and I think that stage 1 may be abnormal, maybe sometimes a little bit REM-y, or maybe a little bit wake-like. So, I hope that answers your question.
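For reference, a confusion table of the kind just described is straightforward to compute; the labels below are synthetic and only illustrate how wake-versus-REM errors are read off the off-diagonal cells, independent of overall accuracy.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

stages = ["Wake", "N1", "N2", "N3", "REM"]
rng = np.random.default_rng(1)
manual = rng.integers(0, 5, 960)                  # human scoring (synthetic)
predicted = np.where(rng.random(960) < 0.85,      # model agrees ~85% of the time
                     manual, rng.integers(0, 5, 960))

cm = confusion_matrix(manual, predicted, labels=range(5))
# Rows = manually scored stage, columns = predicted stage; off-diagonal cells
# show exactly which pairs of stages get confused.
wake_called_rem = cm[0, 4] / cm[0].sum()
rem_called_wake = cm[4, 0] / cm[4].sum()
print(cm, wake_called_rem, rem_called_wake)
```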
Unsupervised learning techniques for NT2 and IH? That's a very good question, too. I think you all know that it has now been shown that the MSLT is probably not reliable for differentiating NT2 from IH, because if you repeat it in NT1, it is very consistently positive, but in NT2 and IH it can flip from one diagnosis to the next. So, I think we need to come back to a clinical definition of IH and NT2, probably, or something of that nature, and then try to use AI to predict what really matters — reaction to drugs, or REM versus non-REM kinds of disease. And I think that really needs a much deeper discussion, because we have to go back to the drawing board: right now we really don't know what IH and NT2 are. So we have to come up with a new definition before we can even apply AI to distinguishing the disorders. Thank you so much, Dr. Mignot. Dr. Sun has a few questions for Dr. Westover regarding his research as well, so we will move on there, and we'll have a panel discussion at the end. Yeah, thank you, Dr. Mignot, and thank you, Dr. Jefferson, for the excellent Q&A session. So now let's switch to Dr. Westover, who will talk about the AI-generated full sleep report. These are the questions for you, Dr. Westover. Could you first briefly introduce what the AI-generated full sleep report is and what it includes? Sorry, I was muted. Yeah, so I think you're referring to a project that internally we call CAISR, the complete AI sleep report, and I'm happy to explain that. This is a project that involves multiple collaborators. We have a version one that we're working on together with Robert Thomas at Beth Israel and collaborators at Boston Children's. And then we have the final version, which we hope will basically replace what doctors and technicians do with computers — that's stage two, longer term — working with Emmanuel and some collaborators at Emory, Lynn Marie Trotti and many others. Anyway, what it is: normally, if you go to the lab for a sleep study right now, the usual process is that a technician reviews all of the data and painstakingly annotates it, right? Sleep staging, writing down where the arousals are, annotating all the breathing events and categorizing them, marking the limb movement events, and then, either manually or by some computed rules, detecting patterns in those. And then a physician reviews the annotations later and maybe modifies them. This whole process has a number of problems. One is that it's slow and takes a lot of labor — you usually have to devote a couple of full-time people to it. A second problem is that, partly because of that, it's not very scalable. Many people who might have problems — Emmanuel alluded to this before — you can't do widespread screening for basic problems like RBD or even apnea, right? A lot of problems go undiagnosed in part because of this bottleneck, in addition to other bottlenecks that make it hard to scale. And the last problem — I like talking to Magdy Younes about this; he's one of the pioneers of automated sleep scoring. He said that when they had a program that they felt was doing staging quite well — he was convinced it was at least as good as people — when they were testing it, even if you had somebody correct the program's output and then gave them the same study a few weeks later, they would change some of the results back and forth, kind of out of pride, right? Thinking we're better than the computers. So the problem of inter-rater variability is a big problem in human-based scoring of all these different events. So the AI sleep report is trying to reproduce what humans do, but for all the different tasks that a technologist and a physician would normally have to perform, and in a way that can be verified to be at least as accurate as any two experts are compared to one another. We want expert-expert reliability to be matched or surpassed by algorithm-expert reliability for each of those tasks. So that's it: it's automating that whole process, taking the human — initially, well, eventually — out of the loop, but for now making it much, much faster to get to that report, and then we would have humans overread it and verify that it's correct.
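The acceptance criterion just described — algorithm-expert agreement at least matching expert-expert agreement — can be written down directly. A sketch, assuming per-epoch labels from several human scorers and Cohen's kappa as the agreement metric (the metric choice is an assumption; the same check would be repeated per task and per phenotype):

```python
from itertools import combinations
import numpy as np
from sklearn.metrics import cohen_kappa_score

def algorithm_matches_experts(expert_labelings, algo_labels):
    """expert_labelings: list of per-epoch label arrays, one per human scorer."""
    expert_pairs = [cohen_kappa_score(a, b)
                    for a, b in combinations(expert_labelings, 2)]
    algo_pairs = [cohen_kappa_score(algo_labels, e) for e in expert_labelings]
    # Pass if mean algorithm-vs-expert agreement reaches the mean
    # expert-vs-expert agreement on the same recordings.
    return np.mean(algo_pairs) >= np.mean(expert_pairs)
```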
Yeah, I agree. So I think now the audience has a general understanding of the AI full sleep report. The second question: usually the people who actually use this kind of technique do not have much background in artificial intelligence. What knowledge should they have to understand the pros and cons? And if it makes errors, how should they deal with that? Yeah, that's an interesting question. So I mentioned there's a version one and a version two, right? With version one, which we are almost done with now, humans should still be in the loop. We're developing version one on data from around 30,000 people — actually more than that, maybe around 50,000. But even if we include epidemiological cohorts in addition to the clinical cohorts that we have — that's a lot of people, but it's not clear that it really spans every possible case that you might want to do sleep analysis on. So if we want a truly generalizable AI algorithm, one that we can trust to be as good as any physician or tech, we would want it to be not just trained on, but in fact tested on and verified to be accurate for, every phenotype of interest. And that's a lot, right? Spanning the whole lifespan, from babies to patients in their 80s and 90s. Then we need to verify that it works regardless of sex and race, and then across clinical phenotypes: there's RBD, narcolepsy, there's going to be MCI and Alzheimer's disease — a wide range of diseases that might affect the performance. We think it's going to need about 200,000 patients; version two is going to be based on that. Until we have that, I think there's always going to be a need for someone to have some understanding and be able to verify the results of these algorithms. So what knowledge do they need? For right now, it's useful to have some understanding of basic concepts, like the idea that there are inter-rater reliability issues — that anything an AI algorithm produces could be wrong. So it's always going to be necessary to have some understanding of how to score sleep. But in the future, I think we'll need less and less detailed understanding of AI algorithms. Just like driving a car: hopefully it's not necessary to let only the people who understand how to build a car drive them, right? To the extent that we're successful, we'll make it so you don't need to know very much — but that's in the future. So for now, you need to at least know how to read studies. One of our goals in making the complete AI sleep report is to make it self-explaining: it will show you why it thinks something is a particular stage or a particular type of event, so that you can disagree instead of having to trust it as a black box. Yeah. Thanks. So for the sake of time, I think question four is actually already answered — how to validate: Dr. Westover mentioned it has to be validated against inter-rater agreement. But I still want to ask question three as the last question: can automated sleep staging bring new knowledge to sleep medicine? Yeah. I think Emmanuel partially answered this, right? With the work his group has been doing on narcolepsy, and it sounds like on RBD as well. But I think yes. There's information in the physiology — it's so rich. There's a lot more information in there than we are currently extracting, or even probably can extract as people. For example, there's work by Dr. Haoqi Sun on extracting something called brain age: how old does somebody's brain appear to be compared to their actual age? That has information that's relevant to underlying disease. We know that it's possible to predict life expectancy reasonably well from sleep, and there's also some evidence that you can predict the risk of future problems, like risk of stroke and risk of depression, and so on.
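A toy sketch of the brain-age idea mentioned here: regress chronological age on sleep-EEG features, then read the gap between predicted and actual age as a brain age index. The features and data below are synthetic assumptions, not Dr. Sun's actual model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
n = 500
eeg_feats = rng.normal(size=(n, 20))   # e.g., spindle density, slow-wave power
age = rng.uniform(20, 90, n)           # chronological age (synthetic)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(eeg_feats, age)
# In practice the prediction would come from held-out (cross-validated) data;
# a positive index means the sleep EEG looks "older" than the person is.
brain_age_index = model.predict(eeg_feats) - age
```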
So I think there's a lot of potential to discover new information from the PSG that we will be taking advantage of as we move forward. Yes. Thank you. I think question five is a general question, so it will be discussed later. Thanks again, Dr. Westover. So let's move on to Anuja. Thanks. Yeah, thanks, Haoqi. All right. Well, welcome, Mr. Ngo. You have actually brought the kinds of products we've been talking about to the bedside — you've really made that transition, and you've got a fully functional product which is being used. So I'll jump right into the questions, and there are a few other nuances I also want to talk about. Why don't we start with you telling us how you have used AI to improve mask fittings and improve adherence? Yeah, so as you know, over the past couple of decades the number of masks in the marketplace has increased dramatically. And although that's a blessing, it can also be a curse for our industry, because there's so much data out there. Sorting through all that clutter of what makes a good mask versus a bad mask really stems from the fact that there are different types of masks out there, too. There's the full face mask, the standard nasal mask, the nasal pillow mask, and now we have the introduction of hybrid masks and under-the-nose masks. When you expect all these clinicians, both new and old, to sort through this information, it's really hard to work out who these masks are good for. So what AI does is allow us to sort through all the clutter. And hopefully, if we utilize the technology effectively, we can go beyond just sizing a mask to finding the right mask for each individual. We can start to look at other elements such as PAP data; demographic information, such as a person's ethnicity or gender; or even their settings and PAP pressures, because we know there's a tendency for certain masks to work better at higher or lower pressures. And then we can ask pertinent questions that the AI can sort through to match the right mask to each individual. So the key is really matching a profile to a patient. That's what I think our AI is able to do once we can query thousands, and hopefully even millions, of data points in the future. And that will inevitably improve outcomes for PAP adherence and long-term compliance. That is very commendable. But as you mentioned yourself, your AI software is continuously learning. Yes. So as every single patient uses it, the software itself learns from that. Do you think that's a limitation of your software, or what would you say? I think it's actually the advantage of the software, because our ability to allow patient input as well as clinician input into the software harnesses the information from the groups of individuals who are using these devices, so that the software can derive the information needed to create an outcome. And it's really important to note that when we think about data sets, right now we have a very limited amount of data in the marketplace. That is the challenge for our industry in terms of AI for the mask fitting space. So to just outright say, hey, it's 98% predictive accuracy right from the get-go is really not fair to do at this stage.
Having said that, if we continue to collect data in a manner where patients can input it in a simple fashion — through our app, we're able to push it out there for patients to use on their mobile devices and on their computers — then the deployment of the technology is quite simple, the input from those patients is simple to obtain right away and in real time, and the AI and the data set can be harnessed to create an outcome quite quickly. So when it gets into that domain, do you feel at some point that you are losing control of your product? Because if the product is learning something, do you still have that form of regulation or control over whether what it's learning is right and what it's interpreting is right? I think that's where most of us struggle when we think of AI — that barrier between augmentative versus autonomous. Yes, and I think the way we have to look at it is that we do need that input from the patients. And when you look at the data that's coming in, yes, it is live, but what we hopefully will also be able to allow patients to do is provide us data both positively and negatively, right? Because if you're only taking positive data, meaning you're only saying these masks are good, then it's hard to really say which masks have not worked for that profile. What we actually allow patients to do in our system is to rate the masks that have been recommended by the system positively or negatively, so that the negative ones are accounted for in the system, the data set, and the algorithms just as much as the well-rated masks. That's a great feature.
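A toy sketch of the feedback loop just described — candidate masks scored per patient profile and updated by positive or negative ratings. The profile keys, mask names, and scoring scheme are all invented for illustration; they are not MaskFit AR's actual algorithm.

```python
from collections import defaultdict

class MaskRecommender:
    """Rating-weighted shortlist; profiles and masks are hypothetical."""
    def __init__(self):
        # score[profile][mask] accumulates +1 / -1 patient feedback.
        self.score = defaultdict(lambda: defaultdict(float))

    def recommend(self, profile, masks, k=5):
        # Rank candidates by accumulated feedback for this profile; the
        # clinician then chooses from a short list instead of 150 masks.
        return sorted(masks, key=lambda m: self.score[profile][m],
                      reverse=True)[:k]

    def feedback(self, profile, mask, liked):
        # Negative ratings count too, so poorly tolerated masks sink.
        self.score[profile][mask] += 1.0 if liked else -1.0

rec = MaskRecommender()
profile = ("nasal", "high_pressure")   # e.g., mask type + PAP pressure band
rec.feedback(profile, "MaskA", liked=True)
rec.feedback(profile, "MaskB", liked=False)
print(rec.recommend(profile, ["MaskA", "MaskB", "MaskC"]))
```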
Now, coming to the third question: when you're transitioning to a product which is AI-enabled and putting it in the marketplace, I'm sure it has its own challenges. You need a whole different set of skills to implement that technology, and there's feasibility and cost — I think those are some of the barriers, plus a general mistrust in the population. So what are some of the barriers that you have experienced? Well, there are two major challenges that we encounter. I think the cost part is actually quite nominal relative to the cost savings that most organizations can benefit from. So the two major challenges really stem from the fact that old habits die hard. We have experienced clinicians and non-experienced clinicians, but certainly with experienced clinicians, they prefer to trust only their own judgment, which is fine. I have no problem with that, because I'm an experienced clinician who has set up thousands of people on CPAP, but it's important to embrace technology for what it is. It's about using the technology as a tool, like a computer or a smartphone, rather than thinking of it as a job replacement. That's really how we want to look at it, because when we use these tools effectively, we create efficiencies and we reduce errors. So that's point number one. The second challenge, I feel, is what I mentioned earlier: there's always a fear with AI that it's a tool to replace people's jobs, whether you're in the field of medicine or the field of manufacturing. But my personal opinion of AI is that if you use it correctly and you're able to enhance work efficiency, it will actually help you do the things that you're meant to do. Most AI at this stage is meant to take over repetitive tasks that are nonetheless time-consuming for people. By utilizing AI, we free up more time so that we can do the things that we are meant to do, and then we can actually do a better job at what we do. Because one of the things AI can't do is replace the human element of engaging our patients. We can't create AI that creates meaningful conversations to provide education, support, compassion, and empathy for patients. These are things that AI can't do. So what we want AI to do is those basic tasks, so that we can focus on the things that are actually the key to unlocking improved compliance and adherence for patients. And once we're able to embrace this, then in the future we can gradually evolve to other things related to mask fitting with the use of AI technology. Very articulately said — as I always say, medicine is an art, not just a science, so let's not miss that art. All right. With that, I will save the fourth question, because we pose it to all the panelists. But I do have a few questions from the live audience, and I want to ask two specifically. I do want to give a special shout-out to Kathy Goldstein, who is sending some amazing questions, but I'm actually going to skip hers and go to the ones which are relevant to your talk right now. So Peter Oster says: MetaMason attempted to tackle the problem of using AI to develop a mask for an individual patient and then use 3D printing to actually print a mask for that patient. Although they weren't able to get FDA approval, this methodology is being used in Singapore. Is this possible here in the United States? So when I first started this whole investigative process of imaging — this was back in 2013 — it was actually to design a 3D-printed mask as well, custom made to a face. But one of the things I quickly realized is that a custom-shaped mask for a nose is not as comfortable as you would imagine it to be. If you made a perfectly shaped mask for the nose, it actually feels like a wetsuit on your body: it fits you perfectly, but it's not comfortable. I think one of the things that people don't realize when creating a mask is that when we create masks with space, especially a standard nasal mask, with extra dead space and a reservoir for air to flow around, that actually adds to the comfort. So it's not just about 3D printing a mask to the shape of your nose. There are so many elements that we have yet to learn. And I think where this technology can also take us is that when we collect enough data sets of the different profiles of the face, we can figure out what these tolerances are to make a comfortable mask in the future. Which brings us to the next question from one of our audience members: with mask issues, so many variables are patient-specific, which a human clinician easily identifies — how far is AI from that? That alludes to what you already talked about: AI is not necessarily able to replace us where the mask is concerned. Yeah, and that's why the technology that was created was not meant to just give a finite answer. It's not like one plus one equals two with this device. It gives you a list of potential masks that could fit that individual, and then you can harness the clinician's experience to add value to that selection. So you can have, let's just say, three or five choices to choose from.
But then, based on the human interaction with the patient, you can now say: you know what, there are five very good masks that can work for you, but based on my experience, I think this would be the best one for you. Rather than having that clinician — especially an inexperienced one — start with a patient and say, well, I've got 150 masks in my head and I really don't know which one works for you, but I was promoted this mask by this rep and he said it was good, so I think this one will work for you. I think it takes away that bias right from the get-go and gives you a profile to select from, and I think you're statistically going to have a better outcome based on that list generated at the onset. Okay. All right, I have time for one last question before I hand it over to Dr. Jefferson. How does this technology compare with in-person mask fitting in terms of satisfaction, success, and CPAP compliance? Have you had a head-to-head comparison? Actually, we have, in some of the clinics. One of the things that has always been a challenge for those of us clinicians who have been in the field for a long time is that when we do mask fittings, patients often wonder: well, out of all the masks that are out there, how do you know that the one you just selected is good for me? And so obviously we lean on the fact that we say, well, based on our experience, this has worked well for lots of patients. But in the heads of many patients is: how is it good for me, not your other patients? When we use this tool, it actually involves the patient in a process, and they see and feel a process being applied to them that is standardized. And from that point on, if you explain that, look, based on the technology that we've been able to administer to you, the selection process indicates that these are likely going to be your best chances for success, they feel engaged in that process, and as a result, the confidence level increases. As opposed to the old way, again, of trial and error, where we have to lean on the messaging of saying, well, you know, try this mask, and if it doesn't work for you, guess what? We'll try something else next time, and we'll find one that works for you. That's not really a confidence builder, especially for new clients, but this process actually gives them that confidence from the get-go. So we're already better than 50-50 as a result. That's very interesting. Yeah. All right. Well, thank you so much for that. I'll hand it over to Dr. Jefferson — and for the sake of time, do you want to start with the live Q&A first, or did you want to go over the question for the panelists? So thank you all so much for all the information you've provided already. There are two questions that we really want to make sure, for the sake of time, that we answer, and I'm hoping all the speakers can contribute. What are the barriers you're facing to incorporating AI in sleep medicine? That's the first question we want you to think about. And then, what solutions do you propose? That's the second question. So if we can start with Dr. Mignot: what are the barriers we're facing to incorporating AI in sleep medicine? You named some of them, and you made some statements earlier in your talk. And what solutions do you propose, if you want to continue building on that? Then we'll ask Dr. Westover and Mr. Ngo. I see several issues.
I think, as usual, the most important problem is not technological; it is more legal and bureaucratic. I think Brandon would probably agree with me that, technically, a lot of these AI programs are working, and they should be relatively easy to implement, but we need to make sure that they are well implemented and that people will trust the results. And as such, I think it's very important that they are evaluated independently, that they are verified. But of course, this may take quite a bit of time, and I'm very happy that the AASM is taking a lead there, but I'm sure the FDA, et cetera, will need to do something as well. The second barrier was mentioned a little bit: it's just a general fear that people have with AI, that maybe it could replace jobs. And I entirely agree with what was said. In general, I think this fear of technology is overstated. Now, there are people talking about artificial general intelligence, the idea that artificial intelligence could almost replace people. I think we are very, very far from there. In general, I think AI will just help you make better decisions and have new ideas. That's one of the reasons I gave the example of narcolepsy, where, by scoring sleep stages, we discovered something new. I think we still need human creativity and these very unusual associations to fully benefit from AI. And in general, technology creates new jobs — there will probably be even more jobs. I think those are the two main barriers that I see. On my end, I agree with Dr. Mignot. The regulatory component is an important aspect we often have to overcome. And it's important that we also help our industry understand that there are different types of AI out there, and that the ways we obtain data are of different natures. For example, in the space that I work in, people often ask me: well, how are you able to collect this information? Are you taking images of our faces and storing them in a data set? Well, part of our patent-pending technology is that we're able to collect data without actually storing images; we just take reference points. So that's very key for us, and that's the security component that people are often fearful of. And then the second thing is: what are we doing with the data set? How are we managing this data set to create an outcome that you can trust? The challenge sometimes with any type of algorithm is that it can be influenced by human intervention and bias, right? You can influence it in many different ways, whether it's a pay-for-outcome model or anything like that. But that's where I think we have to be very transparent. And maybe it's bodies like the AASM or other governing bodies that will start to look at the different AI that's out there, so that you can squash some of the claims being made of a completely unbiased, agnostic approach to making an outcome available to individuals. Because the claims are that it is agnostic and unbiased, but really we know, by the nature of the business, that there are biases created intentionally or unintentionally. There are two things I realized. There is another obstacle that's very important: heterogeneity of technology. I think that makes it very hard for AI, because we have different manufacturers, different devices, et cetera.
And of course, if you try to generalize — even for PSGs, if you try to generalize a program like the narcolepsy detector across many, many data sets, it is going to work better across all those data sets overall, but the performance may be reduced a little bit if the quality of the data is not perfect. AI cannot create high-quality data. So we also have to make sure that there is a standard for what kind of data is being collected. We see that now with all these portable devices: you can measure actigraphy in many different ways, at different resolutions, and AI will not be able to solve that. If we try to diagnose narcolepsy with actigraphy at five-minute resolution, that's totally stupid. So I think it's also very important that we keep watch on what information all these new devices coming on the market are going to bring, and we need a minimal standard for AI to apply to these data sets. I think that's another very important problem we're facing: the heterogeneity of the data. To piggyback on that — that's why it's very important that we're inclusive of all devices, but also understand that there are limitations to certain devices and benefits to certain devices. We can try, at the back end, to standardize the data sets so they are consistent, and then from there start to derive solutions and outcomes. I would say that I have a little bit of a different perspective. Personally, I don't think we can make a blanket statement that AI is either ready or not — AI is not just one thing — but in terms of automated analysis of, say, standard PSG data, I don't think people should just trust it yet. I think there is actually quite a bit of work to do. And the main barrier, again, I think, is that we've never had a large enough and heterogeneous enough set of training and testing data, nor rigorous enough evaluation against certifiable experts, to convince everybody that it really performs on par — that you could just replace a person with it. So until we do that, nobody should fully trust it. We actually do need to do that work. And I think this is related to a very legitimate concern that the public, and especially clinicians, have, which is overhype. It's easy to get really excited about things, and there's an incentive to publish papers fast rather than rigorously — those are often at odds with each other. So although I agree with a lot of what's been said — there's value in what already exists now, a lot of value, and it could help us — we have some rigorous scientific work still to do before I'm ready to say: yeah, okay, the computers can do this part of my job, and I can focus on the parts that maybe they can't do. I still think there's a lot to do. Independent validation is certainly very important, and we're actually trying. Yeah. So that's definitely one part — independence — but it's not just the independence that matters; it's the scale and heterogeneity of the evaluation. You need to evaluate on every cohort that you think is of interest before we can say: okay, here's general-purpose — not general AI, I'm just talking about general-purpose, say, sleep analytics. But I think, Brandon, that's absolutely correct. And in fact, that's one aspect I mentioned briefly, which is really tough: generalization versus being more specialized.
So that's where I think we have a little bit of an issue at some point: AI may work very well on a certain data set when it's well standardized, but as soon as the data become more and more heterogeneous, it can start to make mistakes. And of course, if you try to build a system that works on everything, even bad data, you may get performance that's much more limited than what you could get on a very good data set. So that's another big problem: generalization is good, but it may also reduce performance. You may end up with things that work less well than they could on a better data set, and I think we also need to figure out a way to solve that issue. It's feasible, but it's going to take time. And that goes exactly in the direction you were saying: very large data sets, to be applicable with different levels of confidence, maybe. And I'm going to have to stop us for one moment for the sake of time, because it is 1:01 Eastern. But Matt, are there any pressing questions in there? Because I see a few of our participants have to leave, but those who want to listen to some of the answers can remain. Matthew? There are a few. The general gist of several of them is a question about whether there is an industry standard for how we define AI, and whether there are independently validated approaches out there for AI. On the question of what AI is and how it's defined — AI is just any computer program that does something a person would ordinarily do. So it's a very broad term, and probably a bit of an overhyped term. But the other part of the question was: is there some independent, third-party validation of the performance of AI that we can trust? Again, for each task that needs to be performed, you have to have a different way of validating. For mask fitting, it's one thing; for sleep staging, it's another; for event detection and classification, it's another. I think the AASM is doing some work to try to make that possible with the inter-scorer reliability (ISR) data sets. We had a conversation before this all started about how to get access to that and test on it, and I think that will be a positive thing. But one of my comments was about this limitation in terms of what data sets exist for doing validation. Right now, no adequate data set exists — even just for sleep staging, you need a multiply-scored set of full-night data that really represents all the phenotypes of interest before we can say yes to that question, and that doesn't currently exist. The same can be said for all the other tasks, I think. So building these data sets is a major problem in front of us. And I'm going to go ahead and close for those panelists who have to leave, because I do notice a number of them need to depart. But thank you so much to our panel, and thanks to everyone who participated today. Please stay tuned for other upcoming webinars offered by the AASM on sleep medicine. And in about two weeks, there will be an email sent with instructions on how to access this webinar recording on demand. So if you have to go, we understand. But thank you very much for being here today. Thank you. Can the panelists remain on for about five more minutes, please? Thank you so much. Thank you. Thank you. This is XLAI. Yes, it's a kind of AI, especially when you automate the calculations, right?
Sometimes I wonder if I'm not artificially intelligent, too. No, it's true. AI is a really bad term. I think deep learning is more helpful. I think so, too. That's also not a good term. Yeah, but it's a bit better — at least people know what it is, you know, neural networks. I think a lot of times people associate AI with the Terminator. That's why people have this embedded fear of it. So deep learning is a particular approach, right? But that's not what matters; it's what problem is being solved. Something isn't inherently better because you did it with deep learning if you get the same results, and maybe it's faster or more interpretable with some other approach. So I think we should be more problem-focused than approach-focused. But deep learning is just so popular that it kind of gets you in for free, essentially, if you just say that word. Oh, no, I was not talking about that. I mean, at least it has a definition: deep learning is at least neural networks and a certain architecture. Artificial intelligence can be anything. Yeah, it's very broad. That's what I meant — deep learning is a little bit more restrictive. But you're right, in any case, it changes all the time. It's very funny. Actually, I tried to look at all the terms, because now there's generative AI, of course, and it's impossible: you have the CNN, the LSTM, the bidirectional models, and now your transformers. It's becoming so complicated that you really can't even define all the terms anymore. So let's try to answer one more question. Matt, is there another pressing question? Because we still have 33 participants on who are very interested in hearing your perspectives and answers. Yeah, there are a few on wearables. That's sort of the future, too — not just home-based type 1 or type 2 PSG, but other types of home-based technology, including wearables and nearables, and what your thoughts are on how they can play a role in the future. I think — I mean, wearables are, again, a big collection of very different things, so it's hard to make blanket generalizations. But nevertheless, in general, I think that's the way we need to go. I agree with Emmanuel: we want to move from having people come into the sleep lab to being able to do everything you would do in a sleep lab at home. And part of that is better algorithms — I think a large part of that is better algorithms. There are a lot of devices that exist now that allow you to collect essentially very close to the same quality of data at home, but we also need better devices, right? Because there are things that you cannot do. Even though the manufacturer may claim otherwise, with actigraphy, say, you can't measure, for example, brain age — you don't measure EEG with that. But yeah, that's happening now. I don't think it's too far in the future. Yeah, no, I agree.
And as I tried to mention, I am a big believer in what is called multi-modal learning. I think you really get a lot more bang for your buck if you use completely independent modalities, because, in one sense, you can see each of these modalities as being like just one of your senses. If you have vision, hearing, and smell, you have a much better idea of what is in front of you than if you use only one modality. So if you ask one question, or use actigraphy over a long period of time, plus EEG for one night, that's so much more powerful than using the same thing over and over. Yeah. And I also think it's important to always take into account that data from a real-life environment, when you're able to collect it, is just more useful as a data set. To answer that question specifically: clearly, for narcolepsy and hypersomnia, the movement is going to be to try to measure something all day long, so that you see that people actually do fall asleep during the day, not just during their night's sleep. It's a daytime disorder — you're unable to stay awake. So we need to actually see that people are having microsleeps, or that they have some kind of abnormal EEG reflecting brain fog. So I think you're absolutely right. That's the other very important thing: to try to study sleep and wake in the natural environment. The natural environment, yeah. Because we can't mimic it in any lab setting. Right. And that does mean understanding what you can and can't measure with the different modalities, right? Longer term, you can't get people to wear an EEG all the time — it would be really inconvenient — but you could do actigraphy or one of the ring- or watch-based things for months if needed. So that, plus intermittent high-density physiological measures, is probably the way to go. But it's all going to depend on exactly what clinical questions you're asking. In the space of mask fitting, where I fit in, sometimes I think about things such as pillow height, the sleep environment, how people move in their natural bed, and even complete atonia of the face. That is something we actually ignore when we do mask fittings. The reason I thought of that initially was that the facial morphology changes even when you wear dentures versus no dentures. So all of these things are factors, and I think if we ignore them and just do a mask fitting in clinic, then we're missing a whole subset of variables that actually influence the outcome later on. And there's one last thing that we haven't talked much about: the possibility of feedback. For example, with your mask, you could potentially change the pressure a little bit depending on your position, or you could have an anti-leak system, or detect an apnea and do something about it. That's going to take longer, but I think that's another powerful area of development. Yeah. If we can actually link the data to compliance and mask leaks and things like that, we can really improve the outcomes, too. And in terms of feedback, too, with regard to the quality of the mask that's being recommended: a big challenge right now is that what makes a good mask is determined by what mask was sold, not by what the reaction of the patient truly is toward that mask. Right?
So it's like: oh, we sold this mask, so it must be a good mask. But really, if the long-term compliance isn't there and the patient discontinued therapy within three weeks or a month after acquiring the mask, then obviously it wasn't a good mask. So we need that feedback from the patient, not just from the providers. Thank you again to our panel, and thanks to everyone who participated today. Please check your email in about two weeks for instructions on how to access this webinar recording on demand.
Video Summary
The panelists discussed the barriers and potential solutions for incorporating AI in sleep medicine. Some of the barriers mentioned include the need for independent validation, concerns about job replacement, regulatory challenges, heterogeneity of data and technology, and the fear of AI technology. The panelists agreed that there is a need for independent validation to ensure the accuracy and reliability of AI technology. They also suggested the need for standards and guidelines for the use of AI in sleep medicine. In terms of solutions, the panelists emphasized the importance of transparency, data collection, and standardization of data sets. They also highlighted the need for continued research and collaboration to improve the performance and applicability of AI in sleep medicine. The panelists acknowledged that AI technology has the potential to improve outcomes in sleep medicine, but emphasized the importance of careful and responsible implementation to ensure patient safety and trust.
Keywords
AI in sleep medicine
barriers to incorporating AI
independent validation
job replacement concerns
regulatory challenges
heterogeneity of data
fear of AI technology
standards and guidelines
transparency in AI
data collection
patient safety and trust