Data preprocessing is the step in data mining and data analysis that transforms raw data into a format that computers and machine learning algorithms can analyze. When we talk about data, we usually think of large datasets with rows and columns. That is a common scenario, but it is not the only one: data comes in many forms, such as structured tables, images, audio files, and videos. Machines do not understand text, image, or video data as it is; they work with 1s and 0s, so such data must first be encoded numerically. Real-world data also contains noise, missing values, and other defects, so it cannot be fed directly to ML models. Data preprocessing is therefore required to clean the data and make it suitable for an ML model, which can improve the model's accuracy and efficiency. It involves the following steps:
Getting the Dataset
Importing Libraries
Importing Dataset
Data Quality Assessment:
i) Finding and Processing Missing/Inc...
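As a rough illustration of the import, load, and missing-value steps listed above, here is a minimal sketch that uses only the Python standard library. The sample data, the column names (`age`, `salary`), and the helper functions `load_rows`, `count_missing`, and `impute_mean` are all hypothetical and made up for this example. Mean imputation is just one possible way to handle missing values, not necessarily the one this post goes on to use.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data; an empty field stands for a missing value.
RAW_CSV = """age,salary
25,50000
,62000
40,
35,58000
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, turning empty fields into None."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {key: (float(value) if value else None) for key, value in row.items()}
        for row in reader
    ]


def count_missing(rows):
    """Count missing (None) entries per column."""
    counts = {key: 0 for key in rows[0]}
    for row in rows:
        for key, value in row.items():
            if value is None:
                counts[key] += 1
    return counts


def impute_mean(rows):
    """Return new rows with each missing value replaced by its column mean."""
    means = {
        key: mean(r[key] for r in rows if r[key] is not None)
        for key in rows[0]
    }
    return [
        {key: (means[key] if value is None else value) for key, value in row.items()}
        for row in rows
    ]


rows = load_rows(RAW_CSV)      # importing the dataset
missing = count_missing(rows)  # data quality assessment: find missing values
clean = impute_mean(rows)      # process missing values (assumed strategy: mean)
```

In practice, a library such as pandas is commonly used for these steps. The standard-library version here is only meant to make the idea concrete without extra dependencies.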