Welcome to the Treehouse Community

Want to collaborate on code errors? Have bugs you need feedback on? Looking for an extra set of eyes on your latest project? Get support with fellow developers, designers, and programmers of all backgrounds and skill levels here with the Treehouse Community! While you're at it, check out some resources Treehouse students have shared here.

Posted June 21, 2019 1:55am by Nina Maxberry

Working on a Python project and think I have the steps confused. Any help would be great.

I found my data, and it has over 1000 rows and a "gazillion columns". I'd like to limit it to 250 randomly chosen rows. In a notebook, I created tables for the columns I am drawing from the CSV file, which was published online and which I downloaded locally. I understand I have to insert randomly selected rows from the downloaded CSV into those tables, and that I can possibly use Pandas to do so. My crazy question is: which comes first, creating the tables in "the new database for the project" or manipulating the data in the downloaded CSV? Or am I overthinking the entire process? BTW, I am also using DB Browser to check that I am creating the correct tables. Any assistance would be greatly appreciated.
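
For anyone following along, here is a minimal sketch of the workflow described above, assuming the database is SQLite (DB Browser suggests so, but that is a guess). It uses only the standard library. The function name `load_random_sample`, the file names, the table name, and the column names are all hypothetical placeholders, not anything from the actual project.

```python
import csv
import random
import sqlite3


def load_random_sample(csv_path, db_path, table, columns, n=250, seed=None):
    """Copy n randomly chosen rows (selected columns only) from a CSV into SQLite."""
    # Read the whole CSV; each row becomes a dict keyed by its header name.
    with open(csv_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))

    # Pick n distinct rows at random (or every row if the file is smaller).
    rng = random.Random(seed)
    sample = rng.sample(rows, min(n, len(rows)))

    # Assumption: every column is stored as TEXT to keep the sketch short.
    # Identifiers are double-quoted so headers containing spaces still work.
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    col_names = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)

    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            # No-op if the table was already created by hand.
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')
            conn.executemany(
                f'INSERT INTO "{table}" ({col_names}) VALUES ({placeholders})',
                [tuple(row[c] for c in columns) for row in sample],
            )
    finally:
        conn.close()
    return len(sample)
```

A call might look like `load_random_sample("data.csv", "project.db", "records", ["id", "name"])`. In this sketch the order of the two steps matters little: `CREATE TABLE IF NOT EXISTS` leaves an existing table alone, so the table can be created beforehand or on the fly, provided its column names match the CSV headers being inserted. With Pandas, `DataFrame.sample` and `DataFrame.to_sql` cover the sampling and loading steps in a similar way.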

Nina Maxberry
