Tools in Data Science - May 2025

Tools in Data Science is a practical diploma-level data science course at IIT Madras. It teaches popular tools for sourcing data, transforming it, analyzing it, communicating the insights as visual stories, and deploying the results in production.

This course exposes you to real-life tools

Courses teach you programming and data science. From statistics to algorithms to writing Python code to building models.

But one critical subject that’s rarely covered is: what tools should I pick and how do I become proficient in them?

These tools might not help your CV much. But they will make things easier in real life. For example, at school:

  • You learn from pristine datasets. But in the industry, you’ll have to scrape them yourself.
  • You learn how to train models. But soon, you’ll just pick something from HuggingFace.
  • You learn to write a log parser over weeks. Instead, your boss writes a sed + grep script in minutes.

“We lost the documentation on quantum mechanics. You’ll have to decode the regexes yourself.”

In this course, we’ve curated the most important tools people use in data science.

Learn them well. You’ll be a lot more productive than your peers.

This course is quite hard

Here’s students’ feedback:

  • 2 out of 5 students in the Jan 2025 batch failed
  • It used to be an easy course until 2024.
  • Now it’s hard and covers more. Take it in your last semester if possible.
  • Plan extra time. It takes more time than typical 3-credit courses.
  • LLMs grade you – unpredictably.
  • The ROE is hard.

Take Graded Assignment 1 to check if you’re ready for this course. Please drop this course (do it in a later term) if you score low. It’ll be too tough for you now.

But it's probably worth it.

Here’s students’ feedback:

Programming skills are a pre-requisite

You need a good understanding of Python, JavaScript, HTML, HTTP, Excel, and data science concepts.

But isn’t this a data science course? Yes. Good data scientists are good programmers. Data scientists don’t just analyze data or train models. They source data, clean it, transform it, visualize it, deploy it, and automate the whole process.

In some organizations, some of this work is done by others (e.g. data engineers, IT teams, etc.). But wherever you are, some of the time, you need to write code for all of this yourself.

This course teaches you tools that will make you more productive. But you do need programming to learn many of them.

If you passed, don't enroll again

The course is public, so you can always audit it.

Also, registering again for the course does not improve marks much.

We encourage learning by sharing

You CAN copy from friends. You can work in groups. You can share code. Even in projects, assignments, and exams (except the final end-term exam).

Why should you copy? Because in real life, there’s no time to re-invent the wheel. You’ll be working in teams on the shoulders of giants. It’s important to learn how to do that well.

To learn well, understand what you’re copying. If you’re short of time, prioritize.

To learn better, teach what you’ve learnt.

We cover 7 modules in 12 weeks

The content evolves with technology and feedback. Track the commit history for changes.

  1. Development Tools and concepts to build models and apps.
  2. Deployment Tools and concepts to publish what you built.
  3. Large Language Models that make your work easier and your apps smarter.
  4. Data Sourcing to get data from the web, files, and databases.
  5. Data Preparation to clean up and convert the inputs to the right format.
  6. Data Analysis to find surprising insights in the data.
  7. Data Visualization to communicate those insights as visual stories.

Anyone can audit this course

Everyone has access to:

You can solve these questions any time and check your answers before the submission dates.

Only enrolled students can participate in Discourse, get project evaluations, take the final end-term, or get a certificate.

Those auditing can join the TDS 2025 May Google Group for announcements.

Evaluations are mostly open Internet

Tentative dates:

Exam | Type | Weight | Release Date | Submission Date
GA: Graded assignments | Best 4 out of 7 | 15% | |
Graded Assignment 1 | Online open-Internet MCQ | | Thu 01 May 2025 | Sun 18 May 2025
Graded Assignment 2 | Online open-Internet MCQ | | Thu 05 May 2025 | Sun 25 May 2025
Graded Assignment 3 | Online open-Internet MCQ | | Fri 20 May 2025 | Sun 25 May 2025
P1: Project 1 | Online open-Internet | 20% | Fri 16 May 2025 | Wed 18 Jun 2025
Graded Assignment 4 | Online open-Internet MCQ | | Sat 14 Jun 2025 | Sun 22 Jun 2025
Graded Assignment 5 | Online open-Internet MCQ | | Sat 21 Jun 2025 | Sun 29 Jun 2025
Graded Assignment 6 | Online open-Internet MCQ | | Wed 02 Jul 2025 | Wed 06 Aug 2025
ROE: Remote Online Exam | Online open-Internet MCQ | 20% | Sun 20 Jul 2025 13:00 | Sun 20 Jul 2025 13:45
Graded Assignment 7 | Online open-Internet MCQ | | Mon 04 Aug 2025 | Sun 17 Aug 2025
P2: Project 2 | Online open-Internet | 20% | Fri 11 Jul 2025 | Sun 10 Aug 2025
F: Final end-term | In-person, no internet | 25% | Sun 31 Aug 2025 | Sun 31 Aug 2025
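
To see how the weights in the table combine, here is a minimal Python sketch. It rests on assumptions this page does not confirm: that the best 4 graded assignment scores are averaged, that every component is scored out of 100, and that the total is a plain weighted sum. The function name `course_total` and the sample scores are hypothetical.

```python
def course_total(ga_scores, p1, roe, p2, final):
    """Weighted course total out of 100. All inputs are scores out of 100.

    Assumption (not stated on the course page): the GA component is the
    average of the best 4 graded assignment scores, and the total is a
    plain weighted sum of the five components.
    """
    # Keep the 4 highest graded assignment scores; the rest are ignored.
    best_four = sorted(ga_scores, reverse=True)[:4]
    ga_average = sum(best_four) / 4

    # Weights from the table: GA 15%, P1 20%, ROE 20%, P2 20%, F 25%.
    weighted = 15 * ga_average + 20 * p1 + 20 * roe + 20 * p2 + 25 * final
    return weighted / 100


# Hypothetical scores for the seven graded assignments.
ga_scores = [90, 40, 75, 0, 85, 60, 95]
total = course_total(ga_scores, p1=70, roe=50, p2=80, final=60)
print(f"Estimated total: {total:.2f} / 100")
```

Under these assumptions, a missed or weak graded assignment costs nothing as long as four others are strong, while each project and the ROE move the total by a fifth of their score.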

Updates:

  • GA1 submission date postponed from 11 May to 18 May 2025
  • GA2 submission date postponed from 18 May to 25 May 2025
  • GA3 submission date preponed from 01 Jun to 25 May 2025 since there’s a break the week after
  • P1 submission date postponed from 14 Jun to 18 Jun 2025
  • GA4 release date delayed to 14 Jun 2025
  • GA5 release date delayed to 21 Jun 2025
  • GA7 release date delayed to 04 Aug 2025
  • GA7 submission date postponed to 17 Aug 2025
  • GA6 submission date extended to 6 Aug 2025
  • P2 submission date extended to 10 Aug 2025

Notes

  • Graded Assignment 1 checks course pre-requisites. Please drop this course (do it in a later term) if you score low. It’ll be too tough for you now.
  • Remote exams are open-Internet and hard.
  • Final exam is in-person and closed book. It tests your memory. It’s easy.
  • Projects test application. The projects test how well you apply what you learnt in a real-world context.
  • Bonus activities may be posted on Discourse. See previous bonus activities
  • Evaluations are mostly automated. This course uses pre-computed answers (for objective questions) or LLMs (for subjective questions) to evaluate you.
    • LLMs will evaluate you differently each time. Learn to prompt them robustly to get higher marks.

Constantly check communications

Check these three links regularly to keep up with the course.

  1. Seek Notifications for Course Notifications. Log into seek.onlinedegree.iitm.ac.in and click the bell icon in the top-right corner. Check notifications daily.
  2. Your email for Course Announcements. Seek Inbox messages are forwarded to your email. Check daily. Check spam folders too.
  3. TDS Discourse: Faculty, instructors, and TAs will share updates and address queries here. Email [email protected] cc: [email protected] if you can’t access Discourse.

People who help you

Their job is to help you. Trouble them for your slightest doubts!

Past Course Content