Forecasting Insurance and Patient Charges using Linear Regression
Jayakrishna Natarajan1, Krishanu Das2, Ronak Harish Patil3, R B Sarooraj4

1Jayakrishna Natarajan*, Department of Computer Science & Engineering, SRM Institute of Science & Technology, Chennai, India.
2Krishanu Das, Department of Computer Science & Engineering, SRM Institute of Science & Technology, Chennai, India.
3Ronak Harish Patil, Department of Computer Science & Engineering, SRM Institute of Science & Technology, Chennai, India.
4Mr. R B Sarooraj, Assistant Professor, Department of Computer Science and Engineering, SRM Institute of Science and Technology, Chennai, India.
Manuscript received on February 10, 2020. | Revised Manuscript received on February 20, 2020. | Manuscript published on March 10, 2020. | PP: 1077-1081 | Volume-9 Issue-5, March 2020. | Retrieval Number: E2098039520/2020©BEIESP | DOI: 10.35940/ijitee.E2098.039520
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Insurance companies around the world operate on a simple formula. Customers pay money to the insurer, generally on a monthly basis, and in return are promised a large sum when they face an expensive hospital bill or meet with an accident. Customers are drawn to such schemes because the prospect of having money troubles taken care of, seemingly for nothing, in a time of crisis is very tempting. Insurance companies, on the other hand, hope that nothing happens to their customers or their families, so that they do not come looking for compensation; the money collected from new insurance holders is what they use to pay off losses. Data analysis is the process of understanding the behaviour of a dataset when measured against certain static quantities. In this paper we propose to use data science, and regression analysis in particular, to analyse a dataset of patients and devise a method to predict their insurance amount. Among the various types of learning, linear regression falls under supervised learning. Our dataset consists of over 1300 patients, each with 7 characteristics such as age, sex, BMI, whether they smoke, and whether they have children. We also propose methods to overcome the shortcomings of linear regression, such as multicollinearity and violations of homoscedasticity, and we perform the required data cleaning.
Keywords: Data Science, Linear Regression, Learning, Analysis, Attributes
Scope of the Article: Regression and prediction
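
The approach summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data below are synthetic, the coefficients used to generate the charges are invented for illustration, and the helper names (solve, fit_ols, predict, r_squared, vif) are hypothetical. The sketch fits an ordinary least squares model through the normal equations and computes a variance inflation factor (VIF), one common way to check for multicollinearity. It uses only the Python standard library so that it is self-contained; in practice a numerical library would normally be used.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Return [intercept, coef_1, ...] from the normal equations X'X b = X'y."""
    X = [[1.0] + list(r) for r in rows]  # prepend intercept column
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(beta, row):
    """Predicted value for one feature row."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))

def r_squared(beta, rows, y):
    """Coefficient of determination of the fitted model."""
    mean_y = sum(y) / len(y)
    ss_res = sum((t - predict(beta, r)) ** 2 for r, t in zip(rows, y))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

def vif(rows, j):
    """VIF of column j: 1 / (1 - R^2) from regressing it on the other columns."""
    target = [r[j] for r in rows]
    others = [r[:j] + r[j + 1:] for r in rows]
    r2 = r_squared(fit_ols(others, target), others, target)
    return 1.0 / (1.0 - r2)

# Synthetic stand-in for a patient dataset (NOT the data used in the paper).
rng = random.Random(0)
rows = []
for _ in range(200):
    age = float(rng.randint(18, 64))
    bmi = rng.uniform(18.0, 40.0)
    smoker = 1.0 if rng.random() < 0.2 else 0.0
    rows.append([age, bmi, smoker])

# Invented generating model: the coefficients are assumptions for illustration.
charges = [1000 + 250 * a + 300 * b + 20000 * s + rng.gauss(0, 500)
           for a, b, s in rows]

beta = fit_ols(rows, charges)
print("intercept and coefficients (age, bmi, smoker):", beta)
print("R^2:", r_squared(beta, rows, charges))
print("VIF per feature:", [vif(rows, j) for j in range(3)])
```

Because the synthetic features are drawn independently, the VIF values should be close to 1. A value well above 1 would suggest that a feature is largely explained by the others, in which case it could be dropped or combined before fitting.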