Golden Rules of Data Science
- tirthankarghosh5
- Jun 17, 2022
- 2 min read
Updated: Jun 19, 2022
A Post By: Tirthankar Ghosh
Getting started in data science can seem pretty daunting to beginners: which steps to follow, how to start, which resources to look for, which experts to follow and look up to, when to start entering Kaggle competitions, and so much more!
If you are looking to break into data analysis, my previous article will really help you here. It's written for absolute beginners and is pretty basic, so give it a read. 5 mins tops!
Now, coming back to our topic: whatever your level of expertise in data science, there are some rules that nearly all data scientists will agree on. It is wise to remember these when you are starting out so as to gain an upper hand in the field.
1. Data will NEVER be clean
No matter how good your client or your data source is, the data you get will never be clean. You will have to use various data cleaning techniques. If you are a Python programmer, you already know how much NumPy and pandas help in this regard.
There will be times when there are null values, inconsistent date formats, or formats that vary from row to row, or worse, the data you end up getting might be corrupt. In these cases, you will have to clean the data before you start any operations on it.
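To make this concrete, here is a minimal pandas sketch of that kind of cleanup. The table, its column names, and the two date formats are made up for illustration; real data will need its own rules, and dropping rows is only one of several ways to handle bad values.

```python
import pandas as pd

# Hypothetical raw data showing the problems described above:
# a missing name, two different date formats, and non-numeric amounts.
raw = pd.DataFrame({
    "customer": ["Ann", "Bob", None, "Dia"],
    "signup_date": ["2022-06-17", "17/06/2022", "2022-06-19", "not a date"],
    "amount": ["10.5", "20", None, "abc"],
})

clean = raw.copy()

# Parse each assumed date format separately; values that do not match become NaT.
iso = pd.to_datetime(raw["signup_date"], format="%Y-%m-%d", errors="coerce")
dmy = pd.to_datetime(raw["signup_date"], format="%d/%m/%Y", errors="coerce")
clean["signup_date"] = iso.fillna(dmy)

# Convert amounts to numbers; anything unparseable becomes NaN.
clean["amount"] = pd.to_numeric(clean["amount"], errors="coerce")

# Drop rows that are still incomplete after the conversions.
clean = clean.dropna(subset=["customer", "signup_date", "amount"]).reset_index(drop=True)

print(f"Kept {len(clean)} of {len(raw)} rows")
print(clean)
```

Printing how many rows survived is a cheap habit worth keeping: it tells you how dirty the data really was before you build anything on top of it.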
2. Never underestimate the value of regression models!
Simple models such as Linear or Logistic Regression will be good enough for the majority of problems. You don’t need neural networks to solve every problem.
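As a rough illustration of how little code a simple baseline needs, the sketch below fits a straight line to synthetic data with NumPy. The data is generated here purely for the example (a true slope of 3 and intercept of 2 plus noise), so treat the numbers as assumptions, not results.

```python
import numpy as np

# Synthetic data, made up for this example: y = 3x + 2 plus random noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, size=x.size)

# Ordinary least-squares fit of a degree-1 polynomial (a straight line).
slope, intercept = np.polyfit(x, y, deg=1)

# R^2: the share of the variance in y that the line explains.
pred = slope * x + intercept
ss_res = np.sum((y - pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.3f}")
```

A baseline like this gives you a number that any fancier model has to beat, which is often the quickest way to find out whether the extra complexity is worth it.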
3. SQL and Excel are your best buddies!
Don’t underestimate the power of Excel and SQL: they are still two of the most useful tools for data analysis. I have seen so many beginner data scientists eagerly jump into Machine Learning models, thinking that is all they will use on the job, while having no knowledge of databases or SQL.
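For a taste of what everyday SQL looks like, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The orders table and its values are hypothetical; the point is that a single GROUP BY query answers a typical business question.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Ann", 10.0), ("Bob", 25.0), ("Ann", 5.5), ("Bob", 4.5), ("Cy", 7.0)],
)

# Number of orders and total spend per customer, biggest spender first.
rows = conn.execute(
    """
    SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM orders
    GROUP BY customer
    ORDER BY total DESC
    """
).fetchall()
conn.close()

for customer, n_orders, total in rows:
    print(customer, n_orders, total)
```

The same summary in Excel would be a pivot table, which is exactly why these two tools cover so much of day-to-day analysis.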
4. You can’t master everything!
5. Power of Storytelling
No matter how good your model is or how strong your results are, none of it will matter if you can’t present your findings well to stakeholders, such as your seniors or clients, who have no idea about the model you are deploying or the high precision you achieved. It is extremely important to make sure that you get your message across effectively and clearly.














