Welcome to the Treehouse Community

Want to collaborate on code errors? Have bugs you need feedback on? Looking for an extra set of eyes on your latest project? Get support with fellow developers, designers, and programmers of all backgrounds and skill levels here with the Treehouse Community! While you're at it, check out some resources Treehouse students have shared here.

Looking to learn something new?

Treehouse offers a seven day free trial for new students. Get access to thousands of hours of content and join thousands of Treehouse students and alumni in the community today.

Start your free trial

Databases

Database schema issue

I am a mech eng. with some front end experience. I’m currently making a Data acquisition device using an Ardunio to collect data and package it(multiple users will submit several sessions per week per user and between 15-1000 entry arrays per session) It will need to be sent to the database as 1-D arrays and pulled as 1-D arrays, this should be possible for specific users and obviously secure their individual data.

Then the 1-D array will need to be pulled to the browser and processed using JS. Im now at a loss when deciding which database to use. Ive learnt a lot of JavaScript and thought using a JSON file with my 1-D array. and use node.js and mongo databases and avoid SQL and PHP? But have spoken to a friend who has suggested using SQL. Just because its more widely used. Now I’ve learnt the intro to SQL and dont know if I can now use a combination of SQL and JSON files.

Also I’m trying now to connect to MySQL sever with node.js which is challenging. What tracks and advice if any would be useful?

2 Answers

I was having fun with this, so here are suggestions and things for you to consider.

Database Schema

If I understand the data you intend to work with, this is a suggested schema for you:

At the table level, I believe you should have the following tables:

  • users
  • swimming_sessions

The users table should contain metadata for your user that helps you identify which is essential for authentication (logging in) and authorization (showing them only their own swimming sessions).

For the users table I suggest the following fields:
+---------------+---------------------------------------------------------------------------+
| id            | This should be a unique key that is generated by the database that allows |
|               | you to establish efficiently queryable relationships to records in        |
|               | other tables.                                                             |
+---------------+---------------------------------------------------------------------------+
| email         | You will likely want this to be a unique key, which will prevent other    |
|               | users from creating accounts with the same email but also make            |
|               | querying by this field more efficient.                                    |
+---------------+---------------------------------------------------------------------------+
| name          | This can be broken down into multiple fields if desired, but it is useful |
|               | to have some way of storing the user's name for use in your UI or emails. |
+---------------+---------------------------------------------------------------------------+
| password_salt | Preferable a randomly generated string that is used to create a           |
|               | password hash.                                                            |
+---------------+---------------------------------------------------------------------------+
| password_hash | While for your dissertation you may decide to forgo most security         |
|               | measures to simplify things, but realistically you will want to store     |
|               | only a version of a users password that has been hashed with an           |
|               | algorithm such as BCrypt (which has a node implementation that            |
|               | is easy to use: https://www.npmjs.com/package/bcrypt)                     |
+---------------+---------------------------------------------------------------------------+
| created_at    | Just a DATE or DATETIME. You will want this to mark the creation of       |
|               | the user record and store it as a UTC date.                               |
+---------------+---------------------------------------------------------------------------+
For the swimming_sessions table I suggest the following fields:
+--------------+----------------------------------------------------------------------------+
| id           | Similar to the users table, this should be a unique key generated by the   |
|              | database. While you might not need it for your current use-case, it will   |
|              | give you more flexibility with what you can do going forward with the      |
|              | data collected.                                                            |
+--------------+----------------------------------------------------------------------------+
| swimmer_id   | This is a foreign key that establishes a relationship with records in the  |
|              | users table. This is important for being able to efficiently get           |
|              | swimming_sessions for a user.                                              |
+--------------+----------------------------------------------------------------------------+
| lap_times    | This is the data that you want to store. I recommend storing the JSON      |
|              | you generate from your device as a string. Two types you can consider      |
|              | for this field are VARCHAR and TEXT (TINYTEXT, TEXT,                       |
|              | MEDIUMTEXT, or LONGTEXT). Each have different max sizes and                |
|              | other properties so be sure to do your research before deciding.           |
+--------------+----------------------------------------------------------------------------+
| created_at   | Similar to users, this DATETIME would mark the point the record was        |
|              | created at, which might be useful depending on if you insert the record    |
|              | as soon as the swimming session is ended.                                  |
+--------------+----------------------------------------------------------------------------+

My explanations for these fields are mostly at a higher level so not to take away from the glory of research and reading MySQL docs to determine how to actually build your schema :P

Performance and Scalability

Depending on how much you need to consider performance and scalability you could go down a very deep rabbit hole. Running your database with some random hosting provider can be good for a small project but doesn't likely offer a clean path towards handling rising levels on input and output.

Amazon Web Services (AWS), Google Cloud or other cloud computing platforms offer options for vertically and horizontally scaling databases, as well as offering a number of other services for caching responses to save your database from needing to handle every request.

Since you say in theory your use-case is going to be read heavy, there is an interesting, and I would say more advanced solution that might afford better scalability. I have not personally done this, but if you are feeling adventurous, you could consider using MySQL for storing user data, but use a file storage solution such as AWS S3 for storing the actual session data. This could automatically give you a lot of advantages with scaling and is could interface relatively easily with services like AWS CloudFront to provide caching.

Security

You will need to consider how you will handle logins and authentication. If that is outside the scope of your dissertation then mentioning it when you discuss security could be sufficient.

I can't really tell you which type of database to use (at least not with a lot more info about what you are trying to do :P), but I can give you some things to think about and potential tools to look into....

Some questions to consider:

  • Is this for a school project? Personal project? Work project?
  • How many concurrent users will you need to support?
  • How often will data be written vs read?
  • What types of records will you be storing? Sounds like there is at least a concept of users and session_entries (though I would be more specific if possible).
  • Will you need to query anything by the arrays of data you are storing? If you are only searching by user then a relational database could make a lot of sense.
  • Can specific values in the arrays be updated after the initial insert?

Treehouse courses for using SQL with Node:

Sequelize could be overkill for your project, but might be a good choice if you need to change from one database type to another, though I would not recommend changing databases if possible.

For different SQL databases you can use the following drivers:

I also recommend looking into docker and docker-compose for a local development environment. There is a bit of a learning hurdle, but once you jump that it is quite easy to create a local API connecting to a local database without needing to worry about installing the database to your machine.

Its for a dissertation so I’m mainly focusing on the Data aquisition device. It is built to record swimming records so it will track and log individual lap times. That will be stored in an array (this can be 15 values or 500values long depending on length of the session and they should be able to store different sessions even on the same day). Right now the database and browser are to demonstrate the DAQ device but would only need to prove the concept. (I literally have a new found respect for databases and data management after all this).

So for the system I have currently I’d like to at least allow for one person to write and read, with a bias on reading because once they are in the database they should never change. However in my report I will mention security and scaling issues and what approaches would be best in this situation. To scale to a fitbit kind of situation.

I wanted to use JSON files to hold ARRAYS(15-500 values long). Because I dont need the database to understand anything it just needs to hold the array as an array and be able to pass it on to the browser. but is it possible to store several JSON files in a database? (Sorry if this is a dumb question) nothing along this project has been like I thought it would so lol.

Right now I have MySQL workbench and started to make my database. But the schema is really bothering me. I do agree now that a relational database will be the best because I can store users and then find a way to relate each user to a dynamic table. I currently have two ideas which use a users table with everyone on the system then, (1) each table belongs to a user and each column is that users array for that session [an extra session would be a new column in a table belonging to that user] or (2)where each user uses the same table, each column is a user and each value is a different JSON file with an array within it [so a new session is a new JSON file stored as a value in the users column. These are the two ideas I’m playing with.

So right now i have the DAQ device recording values in an array. (NEED A WAY TO SEND IT TO THE DATABASE)

I have a Database schematic plan (explained above)

I have installed node and am trying to use a JavaScript file to connect to MySQL workbench, (having issues with the terminal not giving me access to MySQL)

I have the front end of the website done and will only need to finish the back-end but I’m more able in that respect. But would I use docker with node or instead of node?

Many apologies if I’m not explaining anything in depth enough or do not make 100% sense. Ive never knew how much work went into all the databases and data management and handling. So I am extremely new to this. And thank you so much for you response!