Big Data

  • The phrase big data is often used in enterprise settings to describe large amounts of data. It does not refer to a specific amount of data, but rather describes a dataset that cannot be stored or processed using traditional database software.

    Examples of big data include the Google search index, the database of Facebook user profiles, and Amazon.com's product list. These collections of data (or datasets) are so large that they cannot be stored in a typical database, or even on a single computer. Instead, the data must be stored and processed using a highly scalable database management system. Big data is often distributed across multiple storage devices, sometimes in several different locations.
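
    As a rough illustration of how a dataset can be spread across multiple storage devices, the sketch below assigns each record to one of several "shards" by hashing its key. This is only a minimal example of hash partitioning, not a description of how any particular product works; the names shard_for and partition, the sample user records, and the choice of four shards are all hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # into the range [0, num_shards).
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(records: dict, num_shards: int) -> list:
    """Split a dictionary of records into num_shards smaller dictionaries."""
    shards = [dict() for _ in range(num_shards)]
    for key, value in records.items():
        shards[shard_for(key, num_shards)][key] = value
    return shards


# Hypothetical data: 1,000 small user records spread over 4 shards.
records = {f"user{i}": {"id": i} for i in range(1000)}
shards = partition(records, 4)

for index, shard in enumerate(shards):
    print(f"shard {index}: {len(shard)} records")
```

    Because the hash is stable, the same key always maps to the same shard, so a lookup only needs to contact one storage device. In a real system each shard would live on a separate machine rather than in one program's memory.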

    Many traditional database management systems have limits on how much data they can store. For example, a Microsoft Access 2010 database can contain no more than two gigabytes of data, which makes it infeasible to store several petabytes or exabytes of data. Even if a DBMS can store large amounts of data, it may slow down considerably once the number of tables or records grows too large. Big data solutions address these problems by providing highly responsive and scalable storage systems.
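
    To put the two-gigabyte limit in perspective, the short calculation below estimates how many maximum-size database files would be needed to hold one petabyte or one exabyte. It is only a back-of-the-envelope sketch: it assumes binary units (1 GB = 2^30 bytes) and ignores all storage overhead, and the function name files_needed is made up for this example.

```python
GIB = 2**30  # one gigabyte, in binary units (assumption)
PIB = 2**50  # one petabyte, in binary units
EIB = 2**60  # one exabyte, in binary units

# The two-gigabyte per-database limit mentioned above.
LIMIT_BYTES = 2 * GIB


def files_needed(dataset_bytes: int, limit_bytes: int = LIMIT_BYTES) -> int:
    """Return how many size-limited files are needed (ceiling division)."""
    return -(-dataset_bytes // limit_bytes)


print(files_needed(PIB))  # 524288
print(files_needed(EIB))  # 536870912
```

    Even a single petabyte would require more than half a million separate two-gigabyte files, which shows why a dataset of this size calls for a system designed to scale.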

    There are several different types of big data software solutions, including data storage platforms and data analytics programs. Some of the most common big data software products include Apache Hadoop, IBM's Big Data Platform, Oracle NoSQL Database, Microsoft HDInsight, and EMC Pivotal One.