Breast Cancer Classification

Gaurav Patil
Jun 9, 2021

Breast Cancer:

  • What is Breast Cancer?

Breast cancer is a disease in which cells in the breast grow out of control. There are different kinds of breast cancer. The kind of breast cancer depends on which cells in the breast turn into cancer.

  • The most common kinds of breast cancer are:
  1. Invasive ductal carcinoma: the cancer cells grow outside the ducts into other parts of the breast tissue.
  2. Invasive lobular carcinoma: the cancer cells spread from the lobules to the breast tissues that are close by.
Image credit: VeryWell / Joshua Seong
  • Breast Cancer Detection:
  1. Using a mammogram: a mammogram is an X-ray of the breast.
  2. Using MRI: a breast MRI uses magnets and radio waves to take pictures of the breast.
  3. By clinical breast exam: a clinical breast exam is an examination by a doctor or nurse, who uses his or her hands to feel for lumps or other changes.
  4. By breast self-awareness: changes that the patient notices.
  • In this post we will use clinical breast exam data, stored as comma-separated values (CSV), to predict a patient's result using a deep learning approach.
  • The dataset can be found at: Github Link. The dataset was obtained from a Kaggle competition, where you can find out more about it.

Deep Learning:

  • Deep learning is a subfield of machine learning concerned with algorithms inspired by the structure and function of the brain, called artificial neural networks.
  • In early talks on deep learning, Andrew Ng described deep learning in the context of traditional artificial neural networks as:

Using brain simulations, hope to:

– Make learning algorithms much better and easier to use.

– Make revolutionary advances in machine learning and AI.

I believe this is our best shot at progress towards real AI

Google Colab:

According to the Google Research website, Colaboratory, or “Colab” for short, is a product from Google Research. Colab allows anybody to write and execute arbitrary Python code through the browser, and is especially well suited to machine learning, data analysis and education. More technically, Colab is a hosted Jupyter notebook service that requires no setup to use, while providing free access to computing resources including GPUs.

The complete code for breast cancer detection using deep learning, along with the dataset used in this blog, is available at: Github Link

Detailed Description of the Code for Classification:

1. Google Colab looks as shown in image 1:
image 1 : Google Colab

2. Upload the dataset to the same Google Drive directory as the notebook:

image 2: Google Drive

3. To access the files in Google Drive, you need to authenticate. Run the code in image 3 to authorize Colab to use the files in Google Drive.

image 3

After running the code, a link will be generated. Click the link and allow Colab to use the files in the selected Drive. After you allow access, a token will be generated; paste it into the text box below the link and press Enter.
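The code from image 3 is not reproduced in the text. One common way to make Drive files visible to a notebook is Colab's `drive.mount` helper, sketched below. This is an assumption and may differ from the exact authentication code in the image; `/content/drive` is the conventional mount point, and the import guard only exists so the snippet also runs outside Colab.

```python
# Minimal sketch: mount Google Drive inside a Colab notebook.
# Assumption: uses google.colab.drive.mount, which may differ from the
# authentication code shown in image 3.
MOUNT_POINT = "/content/drive"  # conventional Colab mount point


def mount_drive(mount_point=MOUNT_POINT):
    """Mount Google Drive and return True, or return False outside Colab."""
    try:
        from google.colab import drive  # only available inside Colab
    except ImportError:
        print("Not running in Colab; skipping the Drive mount.")
        return False
    drive.mount(mount_point)  # asks for authorization in the browser
    return True


mounted = mount_drive()
```

Once mounted, Drive files typically appear under `/content/drive/MyDrive`.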

4. Locate the folder that contains the dataset, as in image 4:

image 4

5. Import the necessary libraries:

image 5
  • Pandas library: Pandas is a library written for the Python programming language for data manipulation and analysis.
  • NumPy library: NumPy is a Python library used for working with arrays.
  • Matplotlib library: Matplotlib is a plotting library for the Python programming language.
  • Seaborn library: Seaborn is a Python data visualization library based on Matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.
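Since image 5 is not reproduced here, the imports might look like the following sketch. The aliases `pd`, `np`, `plt` and `sns` are the community conventions, assumed rather than taken from the original notebook.

```python
# Conventional import aliases for the four libraries described above.
import pandas as pd              # data manipulation and analysis
import numpy as np               # array operations
import matplotlib.pyplot as plt  # plotting
import seaborn as sns            # statistical graphics built on Matplotlib
```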

6. Run the code shown in image 6 in your Colab notebook to read the dataset and display it.

image 6
  • The Pandas library is used to read the CSV file.
  • The Pandas function “read_csv” reads the .csv file.
  • A DataFrame called “data” is generated.
  • The DataFrame is displayed using its head() method.
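As a rough sketch of this step, the snippet below reads CSV text into a DataFrame named `data` and shows its first rows. The column names and values are made up for illustration, and an in-memory buffer stands in for the real file; in Colab you would pass the path of the uploaded CSV to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for the dataset file: a few illustrative rows with
# an id, a diagnosis label and two numeric features (not real patient data).
csv_text = """id,diagnosis,radius_mean,texture_mean
1,M,18.0,10.5
2,B,13.5,14.5
3,M,20.5,17.5
4,B,12.5,15.5
"""

data = pd.read_csv(io.StringIO(csv_text))  # read_csv returns a DataFrame
print(data.head())                         # head() shows up to the first five rows
print(data.shape)                          # (rows, columns)
```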

7. The features and labels are generated from the DataFrame using the code in image 7.

image 7
  • X is an array containing the features of the dataset.
  • y is the array containing the labels, which indicate whether each sample is malignant (M) or benign (B).
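One way to build `X` and `y` is sketched below on a hypothetical miniature DataFrame. It assumes the label lives in a column named `diagnosis` (M for malignant, B for benign) and that an `id` column should be excluded from the features; the code in image 7 may select columns differently.

```python
import pandas as pd

# Hypothetical miniature DataFrame standing in for the real dataset.
data = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "diagnosis": ["M", "B", "M", "B"],
    "radius_mean": [18.0, 13.5, 20.5, 12.5],
    "texture_mean": [10.5, 14.5, 17.5, 15.5],
})

# Features: every column except the identifier and the label (assumed layout).
X = data.drop(columns=["id", "diagnosis"]).values
# Labels: the diagnosis column as an array of strings.
y = data["diagnosis"].values

print(X.shape, y.shape)
```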

8. The categorical data in y is encoded using LabelEncoder from the sklearn library, as shown in image 8.

image 8
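A small sketch of label encoding follows, using illustrative labels and assuming the classes are the strings M and B. `LabelEncoder` sorts the class names, so B maps to 0 and M maps to 1.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

y = np.array(["M", "B", "B", "M"])  # illustrative labels

encoder = LabelEncoder()
y_encoded = encoder.fit_transform(y)  # classes are sorted: B -> 0, M -> 1

print(encoder.classes_)  # ['B' 'M']
print(y_encoded)         # [1 0 0 1]
```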

9. The dataset is split into a training set and a test set in an 80:20 ratio using the train_test_split function of sklearn.

image 9
  • 80% of the dataset is used for training.
  • The remaining 20% is used for testing.
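The 80:20 split can be sketched as follows on synthetic arrays. The array sizes and the `random_state` value are assumptions chosen for illustration, not values from the original notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))     # synthetic: 100 samples, 5 features
y = rng.integers(0, 2, size=100)  # synthetic binary labels

# test_size=0.2 keeps 20% of the samples for testing; random_state (an
# arbitrary choice here) makes the shuffle reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)  # (80, 5) (20, 5)
```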

10. Feature scaling is a technique for standardizing the independent features in the data so that they share a common scale. The features are scaled using StandardScaler from the sklearn library, which transforms each feature to zero mean and unit variance, as shown in image 10.

image 10
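As a sketch of this step on synthetic data, `StandardScaler` replaces each feature value x with (x - mean) / std, where the mean and standard deviation are computed per feature. A common practice, assumed here, is to fit the scaler on the training set only and reuse those statistics for the test set.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(loc=50.0, scale=10.0, size=(80, 5))  # synthetic features
X_test = rng.normal(loc=50.0, scale=10.0, size=(20, 5))

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from training data
X_test_scaled = scaler.transform(X_test)        # reuse the training statistics

# Each training feature now has mean close to 0 and standard deviation close to 1.
print(X_train_scaled.mean(axis=0))
print(X_train_scaled.std(axis=0))
```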

11. The shapes of the feature and label arrays used for training are shown in image 11.

image 11

12. Create a deep learning model and train it on the feature and label arrays so that it learns to classify breast cancer.

image 12
  • The libraries needed to build the deep learning model are imported.
  • The layers used in this model are a Dense layer with 16 units, a Dropout layer with a 10% rate, and a Dense layer with 1 unit and a sigmoid activation function.
  • The optimizer used is Adam.
  • The loss function is binary cross-entropy.
  • The metric used is accuracy.
  • The classifier is created with these layers and compiled with the optimizer, loss and metric mentioned above.
  • The classifier is fitted on the feature and label arrays with a batch size of 100 and trained for 150 epochs.
  • At the end of training, the accuracy obtained is 99.10% with a loss of 0.0519.
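The configuration described in the bullets above could be sketched in Keras roughly as follows. The data is synthetic, the feature count is hypothetical, and the ReLU activation on the first Dense layer is an assumption, since the post does not state it; the accuracy figures quoted above will not be reproduced by this toy data.

```python
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(0)
n_features = 5                                   # hypothetical feature count
X_train = rng.normal(size=(200, n_features)).astype("float32")
y_train = (X_train[:, 0] > 0).astype("float32")  # synthetic binary labels

classifier = keras.Sequential([
    keras.Input(shape=(n_features,)),
    keras.layers.Dense(16, activation="relu"),    # activation is an assumption
    keras.layers.Dropout(0.1),                    # drops 10% of units during training
    keras.layers.Dense(1, activation="sigmoid"),  # outputs a probability
])
classifier.compile(
    optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"]
)

# Batch size 100 and 150 epochs, as described in the bullets above.
history = classifier.fit(
    X_train, y_train, batch_size=100, epochs=150, verbose=0
)
print(history.history["accuracy"][-1])  # training accuracy after the last epoch
```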

13. This neural network solution improves on the accuracy of the breast cancer classification in the GeeksforGeeks article, which reports a test accuracy of 96.51%.

14. The deep learning model was evaluated on the test dataset, and the accuracy obtained was 98.24%.
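How the test accuracy was computed is not shown in the post. One plausible approach, sketched below with illustrative numbers, is to threshold the sigmoid outputs at 0.5 and compare the result with the true labels. The probabilities here are made up and do not come from the trained model.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

y_test = np.array([1, 0, 1, 1, 0])                   # illustrative true labels
probabilities = np.array([0.9, 0.2, 0.6, 0.4, 0.1])  # illustrative model outputs

y_pred = (probabilities > 0.5).astype(int)  # threshold at 0.5 (assumed cutoff)
accuracy = accuracy_score(y_test, y_pred)   # fraction of correct predictions
matrix = confusion_matrix(y_test, y_pred)   # rows: true class, columns: predicted

print(accuracy)  # 0.8
print(matrix)
```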

Thus we successfully classified breast cancer as malignant or benign using the Kaggle dataset, with a test accuracy of 98.24%.

The complete code for breast cancer detection using deep learning is available at: Github Link

Thank you for your time.

Gaurav Patil

MS Western University | Student Researcher | NLP | CV | DL | Python