1. Basics about big data
1.1 Characteristics of big data:
- volume (terabytes to zettabytes, i.e. really a lot)
- variety (structured, polystructured, unstructured)
- velocity (batch, streaming data)
2. Applications
- "Data-intensive applications, challenges, techniques and technologies: A survey on Big Data"
by C.L. Philip Chen and Chun-Yang Zhang
http://www.crisismanagement.com.cn/templates/blue/down_list/llzt_dsj/Data-intensive%20applications,%20challenges,%20techniques%20and%20technologies%20A%20survey%20on%20Big%20Data.pdf
3. Products/Tools
- NoSQL (Accumulo, Aerospike, Alchemy Database, AllegroGraph, Apache CouchDB, ArangoDB, Berkeley DB, Cassandra, Clusterpoint, CortexDB, Couchbase, DocumentDB, Druid, Dynamo, FairCom c-treeACE, FoundationDB, Giraph, HBase, HyperDex, InfiniteGraph, Lotus Notes, MarkLogic, MemcacheDB, MongoDB, MUMPS, Neo4j, Oracle NoSQL Database, OrientDB, Qizx, Redis, RethinkDB, Riak, Stardog, Vertica, Virtuoso)
- MapReduce (Apache Hadoop MapReduce, Disco, DryadLINQ, MATLAB MapReduce, QtConcurrent, Skynet, Splunk, Stratosphere)
- Storage (S3, Hadoop Distributed File System)
- Servers (EC2, Elastic, Google App Engine)
- Processing (BigSheets, Elasticsearch, R, Splunk, Solr/Lucene, Yahoo! Pipes)
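For anyone new to the MapReduce entry in the list above, here is a minimal single-machine sketch of the idea in plain Python (a word count). It is only an illustration and not how Hadoop or any of the listed products implement it; the function names map_phase, shuffle_phase, reduce_phase and word_count are made up for this example, and a real framework would spread these phases across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the grouped values for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run map, shuffle and reduce in sequence (hypothetical helper)."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    docs = ["big data is big", "data moves fast"]
    print(word_count(docs))
```

The map step works on each document independently and the reduce step works on each key independently, which is what would let a framework run both in parallel on separate nodes.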
4. Useful links
- General (Wikipedia): https://en.wikipedia.org
- Open source (Apache Hadoop): http://hadoop.apache.org
- Commercial products (IBM): https://www-01.ibm.com/software/data/bigdata/
- Commercial products (Splunk): http://www.splunk.com
- Implementation for BTC blockchain based on Splunk: https://m.imgur.com/a/xA8Sl
Do you have any interesting articles on big data? Please reply :)
Thanks, Andreas