Python

Huston Petty
2,833 Points

Python Databases VS CSV/Excel Files

Hi there,

I'm not sure if this kind of question is allowed or not. I am currently going through the Python Databases course and I just need to be sold a little more on using Databases in Python.

I am already quite comfortable in Python. I create tools at work with it all the time. Recently I have started building a logging/database system into all of my tools to keep a record of when each tool is used, who used it, the time taken to run, the files processed, etc. I do this by just creating a "data.csv" file and having the tool add a new row with all of the desired info every time it is run.
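
For readers following along, the CSV approach described above might look something like the sketch below. Only the "data.csv" file name comes from the post; the column names and the `log_run` helper are assumptions made for illustration, not the actual tool code.

```python
import csv
from datetime import datetime
from pathlib import Path

LOG_PATH = Path("data.csv")  # file name from the post
# Assumed columns; the real tools may record different fields.
FIELDS = ["timestamp", "user", "seconds_taken", "files_processed"]


def log_run(user, seconds_taken, files_processed, path=LOG_PATH):
    """Append one row describing a tool run (hypothetical helper)."""
    # Write the header only when the file is missing or empty.
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now().isoformat(timespec="seconds"),
            "user": user,
            "seconds_taken": round(seconds_taken, 2),
            "files_processed": files_processed,
        })
```

A tool would call something like `log_run("hpetty", 12.5, 40)` at the end of each run. Note that the user's name is copied into every row, which is the detail the first answer below picks up on.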

So my question is...what is the benefit of using a Database over a CSV file (specifically through a Python script)? It seems a little more out of the way and complicated than just using a CSV file. Especially since I can easily open a CSV in Excel and get sums and averages of different columns of data.

Make no mistake, I am not questioning that databases are useful in general. I am just wondering whether they are useful for Python scripts. Thanks in advance.

2 Answers

Jennifer Nordell
STAFF
Treehouse Teacher

Hi there! Of course, this question is allowed! But I think there might be some misconception here about a database vs your CSV/Excel file. To be clear, those are also databases but are what we refer to as "flat-file" databases. They are not what we call "relational" databases. And the reason to use a relational database over a "flat-file" database would be the same regardless of which language you are using.

Here's a fairly concrete example. You said that right now you are collecting the names of everyone using your tools, and each time a tool runs you add a new row with the name and other information. My last name is Nordell. But what if I get married this weekend and my last name changes? All of your previous entries would still carry my former name. Now imagine you have 10,000 users and you don't want to chase every name change through every row. With a relational database, you can make that change in one place instead of finding and replacing the data everywhere. My name (first and last) would be stored once, under a unique ID, and every log entry would refer to that ID. Because everything is related to that unique ID, updating my last name in the database changes it everywhere.
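
As a rough illustration of that idea, here is a minimal sketch using `sqlite3`, which ships with Python's standard library. The table layout, the tool names, and the new last name "Smith" are all made up for the example. The point is only that the name lives in one row and every log entry refers to it by ID.

```python
import sqlite3


def rename_user_demo():
    """Store a user once, log runs by user ID, then rename the user in one place."""
    conn = sqlite3.connect(":memory:")  # in-memory database; nothing is written to disk
    # Assumed schema: one table for users, one for tool runs that reference them.
    conn.executescript("""
        CREATE TABLE users (
            id INTEGER PRIMARY KEY,
            first_name TEXT NOT NULL,
            last_name TEXT NOT NULL
        );
        CREATE TABLE runs (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL REFERENCES users(id),
            tool TEXT NOT NULL,
            seconds_taken REAL NOT NULL
        );
    """)
    cur = conn.execute(
        "INSERT INTO users (first_name, last_name) VALUES (?, ?)",
        ("Jennifer", "Nordell"),
    )
    user_id = cur.lastrowid
    # Hypothetical tool names; each run stores only the user's ID, not the name.
    conn.executemany(
        "INSERT INTO runs (user_id, tool, seconds_taken) VALUES (?, ?, ?)",
        [(user_id, "resize_images", 4.2), (user_id, "rename_files", 1.1)],
    )
    # The name change is a single UPDATE on a single row.
    conn.execute("UPDATE users SET last_name = ? WHERE id = ?", ("Smith", user_id))
    rows = conn.execute("""
        SELECT users.last_name, runs.tool
        FROM runs JOIN users ON users.id = runs.user_id
        ORDER BY runs.id
    """).fetchall()
    conn.close()
    return rows
```

Calling `rename_user_demo()` returns both earlier runs with the updated last name, even though the `runs` rows themselves were never touched. In the CSV version, the same change would mean rewriting every row that contains the old name.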

There is nothing specific about the Python language that makes a database more or less efficient than a CSV. It depends largely on your data and what you're using it for. :smiley:

Hope this helps! :sparkles:

Huston Petty
2,833 Points

WOW! That is an amazing explanation. I am now officially sold on Databases lol. Thanks a bunch and thank you for an extremely fast reply. :D

Just one more thing. Are there no built-in libraries supporting databases in Python? Or is peewee the only/best one? Or is it just a standard in the tech industry?

Thanks so much.

Jennifer Nordell
Treehouse Teacher

Huston Petty, I know there's the Django ORM and SQLAlchemy, both of which work with databases (though I've heard varying opinions on SQLAlchemy). I mostly have experience with the Django ORM and peewee, though. You might get some of the big dogs like Chris Freeman or Chris Howell to weigh in on this one :smiley:

The most common database ORM libraries I see for Python are:

  • Peewee
  • SQLAlchemy

We already have quite a few courses on Treehouse that use Peewee to interact with a database. Many of our Python Techdegree students learn the Peewee ORM first when building their Project 4 and Project 5. Then they get into Django's ORM in Project 6.

If you are working with Django, it has its own ORM for interacting with databases.

There are some other ORMs like:

  • PonyORM
  • SQLObject

But I have never used either of these or read the docs on them.

And, hi Jennifer Nordell :wave: :smile:

Huston Petty
2,833 Points

Ok gotcha. Great advice. Thank you both so much. I really appreciate it. :D