There is a lot of data, and it is coming from every industry. Let's talk about why.
Every day, we create 2.5 quintillion bytes of data. That is so much data that 90% of the data in the world today has been created in the last two years alone. This much data demands specific systems and tool sets. The tools of big data are built exactly for this problem: large and complex data sets.

Now, as tempting as it may seem, you couldn't and shouldn't just write out a quick Python script to process potentially hundreds of gigabytes of data. Being aware of existing tools that can be used to handle this data is becoming absolutely critical for the modern software developer. Once you're aware of the existing software, you will no longer get stuck trying to solve insurmountable problems with simple scripts.

Now, the good news is that most big data tools will actually work with both small and large amounts of data. One thing to keep in mind, however, is that with smaller data sets, there will always be some amount of overhead that we'll incur by using these big data processing frameworks. So you should always use these with a touch of discretion. For example, if you only ever have a few thousand lines of structured text data and you only need it in one application, you probably wanna just store that in a CSV, or comma-separated values file, instead of going overboard with storing it in some sort of multi-server distributed key-value store. You wanna make sure you follow the KISS rule: Keep It Simple Smarty pants, or something more real than that.

These big data tools also need to be able to adapt to various data types and structures, from text to audio and video. Developers that work in these more specialized domains should be aware of the tools and constraints in their problem space to address their specific problem.
If you're working in one of these domains, you'll need to adapt your workflow to fit into the existing ecosystem of big data tools.

If you take a moment and look around, you'll start to see that we have data sources popping up from nearly every domain. [SOUND] There are sensors being used to gather climate information, posts to social media sites, digital pictures and videos, transaction records from purchases, cell phone GPS signals, and so much more.

Whether you are a developer or a technical expert, being familiar with these use cases allows you to provide tremendous value to your company or organization. They need your expertise in recognizing the challenges and suggesting viable systems to solve them.

So now that you're starting to understand the importance of big data, let's take a look at some of the new problems that we're dealing with due to big data, right after this quick break.
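
As a footnote to the KISS point made earlier in the video, here is a minimal sketch of what "just store that in a CSV" can look like in Python. It is not from the video: the column names, the sample rows, and the `average_temperature` helper are all hypothetical, and the data lives in an in-memory string only so the example runs without a file on disk.

```python
import csv
import io

# Hypothetical sample data standing in for a small CSV file on disk.
SAMPLE_CSV = """city,temperature_c
Portland,18.5
Austin,31.0
Boston,22.5
"""


def average_temperature(csv_file):
    """Return the mean of the temperature_c column from an open CSV file."""
    reader = csv.DictReader(csv_file)
    readings = [float(row["temperature_c"]) for row in reader]
    if not readings:
        raise ValueError("CSV contains no data rows")
    return sum(readings) / len(readings)


if __name__ == "__main__":
    # With a real file, open("readings.csv", newline="") would replace StringIO.
    print(average_temperature(io.StringIO(SAMPLE_CSV)))
```

For a few thousand rows used by a single application, the standard library alone is likely enough. A distributed store starts to pay for its overhead only when the data outgrows one machine or needs to be shared across many applications.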