Diagnosing Pneumonia Using Deep Learning

I was diagnosed with pneumonia in my early teens. I can still vaguely remember being rushed to the hospital from my boarding school because the chest pain, coughing and difficulty breathing that I had endured for a couple of weeks had become far too intense, so intense that I thought I was having an asthma attack. At least I got to see the family and spend some time at home, but the log loss was definitely not worth it * stats joke *. On a more serious note, you might think pneumonia is not life-threatening, and for some that’s true, but it still claims approximately one million lives each year, most of them children under 5 and senior citizens over 65. In 2016, about 1 in every 6 childhood deaths was the result of pneumonia, so it’s obvious we have a silent, deadly problem on our hands.

There are four types of pneumonia: walking, bacterial, viral and chemical. The disease attacks the lungs and inhibits the patient’s ability to breathe, so it can definitely lead to death if left untreated. Symptoms include fever, mucus, chest pain, cough and breathlessness. The good news is that most cases are 100% curable within 4 weeks if diagnosed early. This is why we need advanced technology: to detect signs of pneumonia early, rather than relying on traditional naked-eye judgment. About 100,000 pneumonia cases result in fatalities every year due to misdiagnosis, and since humans are, technically speaking, directly responsible for these deaths, we need to do something about it. Either the doctor or specialist made a false negative, or wrongly diagnosed the illness as some other infection, say COVID-19. Either way, a missed pneumonia diagnosis increases the severity of the infection and the chance of a fatality. Given what we know now, I’d say it’s time to check our human egos at the door and turn to The Machines for answers.

When I found out the alarming number of fatalities due to misdiagnosis, I got curious and decided to dive a little deeper in my research. I found out that misdiagnosis is mainly caused by the lack of an experienced specialist in the community, as in my case, where I had to travel all the way back home just to get some tests done. Misdiagnoses can also be caused by errors from rushing patients through examinations. I would imagine this is more prevalent now, during a pandemic in which hospitals are crowded because of what is, coincidentally, another lung disease.

In the last few years, there have been amazing breakthroughs in computer vision technology, mostly driven by deep neural networks, and convolutional networks in particular. Many architectures have been proposed along the way, and today we can get accuracies of over 90% on image recognition tasks. The ImageNet project is a big contributor to this innovation, providing more than 14,000,000 images labeled by actual humans; so is the adoption of GPUs, which supply the computational power needed to train these networks.

A convolutional network works in stages. The first stage is responsible for simple features like boundaries between colors, edges and curves. The next stage helps detect textures and shapes. The stage after that detects more complex entities, while the last stage is where the actual classification occurs. This is the level where the model interprets the given entity as whatever it thinks is closest to it, in the form of a label or name. These convolutional networks require huge amounts of processing power, data and space to train, more than an ordinary computer provides.
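To make the idea of an early stage "detecting edges" a little more concrete, here is a small sketch in plain Python. It slides a 3x3 filter over a tiny image and responds only where dark pixels meet bright ones. The image, the filter values and the function name convolve2d are all made up for illustration; a real network learns its filter values during training rather than having them written by hand.

```python
# Toy illustration of what an early convolutional stage does: slide a small
# filter (kernel) over an image and respond strongly where an edge appears.
# The image and the kernel are hypothetical values chosen for illustration.


def convolve2d(image, kernel):
    """Slide `kernel` over `image` with no padding and sum the products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kernel_h):
                for dj in range(kernel_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# 4x6 image: dark (0) on the left half, bright (1) on the right half.
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# Vertical-edge filter: right column minus left column.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
print(feature_map)  # [[0, 3, 3, 0], [0, 3, 3, 0]]
```

The output is zero over the flat dark and flat bright regions and positive only around the boundary between them, which is the sense in which a filter "finds" an edge.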

Using transfer learning, I’ll take a high-performing network like VGG16 or ResNet50 and apply it to my X-ray classification problem, since the network has already been trained to classify a wide variety of entities. The network I used is VGG16. It’s known for strong accuracy on image classification, so I expected it to work well for my problem.

First, I imported all the necessary libraries, such as keras, ImageDataGenerator and VGG16. After this, I made functions to plot the confusion matrix, calculate the metrics, and plot the ROC curve and the learning curve, and I defined variables for the image size, path, epochs and batch size, and then for the train and test datasets from Kaggle. Following that, I imported my VGG16 transfer model, set the appropriate weights for the type of images in the dataset, and set the include_top parameter to False. This ensures that the last layer is dropped. I did this because I don’t want to classify a thousand different categories when my specific problem only has two, so I skip the last layer. The first (input) layer is also replaced, since I can simply provide my own image size, as I did.

Before moving forward, since the network has already been trained, I have to write a for loop telling the model that none of its existing layers should be trained; otherwise the learned weights would be changed, and that’s not good. Next, I make a flatten layer and add my last layer to it. When adding that last layer, I pass in the number of folders inside the train folder, which is simply the total number of categories, or classes, in my output layer. I use softmax as the activation function, then compile the model using Adam as the optimizer and categorical cross-entropy as the loss.
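The steps above can be sketched roughly as follows. This is not my exact notebook code, just a minimal version under a few assumptions: TensorFlow 2.x with its bundled Keras, ImageNet as the pretrained weights, and two output classes hard-coded instead of counted from the train folder. The names IMAGE_SIZE, NUM_CLASSES and build_model are placeholders I picked for this sketch.

```python
# Minimal sketch of the transfer-learning model, assuming TensorFlow 2.x.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

IMAGE_SIZE = (224, 224)  # input size used for the X-ray images
NUM_CLASSES = 2          # assumed here: one class per folder in the train set


def build_model(weights="imagenet", num_classes=NUM_CLASSES):
    """Build a frozen VGG16 base with a small softmax classifier on top."""
    # include_top=False drops VGG16's original 1000-category classifier.
    # weights="imagenet" is an assumption about which pretrained weights to use.
    base = VGG16(input_shape=IMAGE_SIZE + (3,), weights=weights, include_top=False)

    # Freeze every pretrained layer so training does not change its weights.
    for layer in base.layers:
        layer.trainable = False

    # Flatten the convolutional features and add the new output layer.
    features = Flatten()(base.output)
    outputs = Dense(num_classes, activation="softmax")(features)

    model = Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling build_model() downloads the pretrained weights the first time it runs. Because the base is frozen, only the weights and biases of the final Dense layer are updated during training.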

The next thing to do is load the training dataset into the model. I do this using ImageDataGenerator, which creates additional, slightly altered images to help train the model. This allows the network to see more diversity within the dataset without making any category less well represented during training. I don’t do the same for the test dataset, as I don’t want to tamper with the data I’ll be validating with. My parameters here are: a rescale value of 1/255, a shear range of 0.2, a zoom range of 0.2, and horizontal flip set to True. After that, I loaded the images using flow_from_directory. My parameters are: a batch size of 32 (the number of images used for training at a given instance), an image size of 224x224, and a class mode of categorical. I then apply the same flow_from_directory parameters to my test dataset and call fit_generator. This takes about an hour to run; a more powerful processor is needed to make it run faster.
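To show what two of the simpler parameters above actually do to an image, here is a toy version in plain Python. It is not how Keras implements them, and the tiny "image" and the function names rescale and horizontal_flip are made up for illustration. Shear and zoom are left out because they involve interpolation that does not fit in a few lines.

```python
# Toy versions of the rescale and horizontal-flip transformations, applied to
# a tiny hypothetical grayscale image stored as nested lists of 0-255 values.
# This only illustrates their effect; it is not the Keras implementation.


def rescale(image, factor=1 / 255):
    """Multiply every pixel by `factor`, mapping 0-255 values into 0-1."""
    return [[pixel * factor for pixel in row] for row in image]


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


image = [
    [0, 51, 255],
    [102, 153, 204],
]

scaled = rescale(image)
flipped = horizontal_flip(image)

print(scaled)
print(flipped)  # [[255, 51, 0], [204, 153, 102]]
```

Rescaling keeps the pixel values in a small, consistent range for the network, while the flip gives it a mirrored copy of the same X-ray that still belongs to the same category.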

The accuracy of our model turned out to be 90% on the first try, so definitely not bad. This number is the percentage of times the model correctly diagnosed a patient. The model loss is 0.2, which is the amount the model is penalized for its incorrect predictions (about 10% of cases). The recall is 90%, which is the proportion of patients who actually have pneumonia that the model correctly diagnosed as positive.
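For reference, here is how these numbers are usually computed, as a small sketch in plain Python. The confusion-matrix counts and the predicted probabilities below are hypothetical values chosen to keep the arithmetic easy to follow; they are not my model’s actual results. The function names are also just placeholders.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Compute accuracy, recall and precision from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        # Share of all patients that were diagnosed correctly.
        "accuracy": (tp + tn) / total,
        # Share of actual pneumonia cases that the model caught.
        "recall": tp / (tp + fn),
        # Share of positive diagnoses that were actually pneumonia.
        "precision": tp / (tp + fp),
    }


def cross_entropy(predicted_probs, true_index):
    """Categorical cross-entropy for one image: -log(probability of true class)."""
    return -math.log(predicted_probs[true_index])


# Hypothetical counts: 100 pneumonia X-rays and 100 normal X-rays.
metrics = confusion_metrics(tp=90, fp=30, fn=10, tn=70)
print(metrics)

# Hypothetical prediction: 82% probability on the true class (index 1).
loss = cross_entropy([0.18, 0.82], true_index=1)
print(round(loss, 3))
```

With these made-up counts the recall is 0.9 while the precision is only 0.75, which shows that the two can differ a lot. The loss example shows that a single prediction giving the true class about 82% probability is penalized by roughly 0.2.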

Recall is the main metric for this project, since it’s the most important metric in medical problems: doctors would rather make a wrong positive diagnosis than a wrong negative one. The AUC score is 0.89, which is the probability that the model ranks a randomly chosen pneumonia X-ray above a randomly chosen normal one. My future plan for this project is to create a classifier to differentiate between pneumonia and X-rays from other lung infections like COVID-19 and tuberculosis. I also want to advance the classifier to detect the exact region of the lungs where the infection is located. For now, health professionals are welcome to integrate this model into their medical software to help them correctly diagnose pneumonia. Please carry out personal verification of my results prior to implementation.
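To make the AUC figure mentioned above more concrete, here is a small sketch in plain Python that computes it the slow but intuitive way: compare every pneumonia image with every normal image and count how often the pneumonia image gets the higher score. The scores and labels are hypothetical, and auc_by_pairs is a name made up for this sketch; in practice a library function would normally be used on the real predictions.

```python
def auc_by_pairs(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    `scores` are predicted pneumonia probabilities; `labels` are 1 for
    pneumonia and 0 for normal. Ties count as half a correct ranking.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    correct = 0.0
    for pos_score in positives:
        for neg_score in negatives:
            if pos_score > neg_score:
                correct += 1.0
            elif pos_score == neg_score:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical predictions for three pneumonia and three normal X-rays.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]

auc = auc_by_pairs(scores, labels)
print(round(auc, 2))  # 0.89
```

Here 8 of the 9 possible pairs are ranked correctly; the one miss is the pneumonia image scored 0.4, which falls below the normal image scored 0.6.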


• What Causes Pneumonia in the Elderly? (2017, October 28). Retrieved from pneumonia-in-the-elderly/.

• Misdiagnosis of Pneumonia: Passen & Powell: Chicago Injury Trial Lawyers. (2018, May 08). Retrieved from pneumonia/.

