When building an application that solves a problem, data from different domains very often has to be analyzed and then combined to produce the desired results.
In this sense, data integration or data fusion is the process of combining multiple types of data that represent the same object into a consistent, accurate, and useful representation. Another definition, which considers how the data sets relate to each other, is: data fusion is the analysis of different data sets in which every data set can interact with, inform, and complete the others.
The result of the data fusion process is expected to be more informative and synthetic than the original inputs.
This means that, once N different datasets are integrated, the result of the data fusion process should be worth more than the sum of the results obtained from each dataset alone.
The human brain can be considered an ideal example of a data fusion system; its most intuitive data fusion process is the fusion of audio and video coming from the two respective human senses.
A less ideal example is multi-sensor data fusion, which combines data from different sensors. It is less ideal because the fusion process is much more error-prone than in the human brain.
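To make multi-sensor fusion a bit more concrete, here is a minimal sketch of one common textbook approach: inverse-variance weighting of independent readings of the same quantity. It is only an illustration under stated assumptions, not a method taken from the papers discussed in this post. The function name `fuse_measurements` and the sensor readings are hypothetical, and the sketch assumes that every sensor reports a known variance and that the sensor errors are independent.

```python
def fuse_measurements(readings):
    """Fuse independent (value, variance) readings of the same quantity.

    Each reading is weighted by the inverse of its variance, so more
    precise sensors contribute more to the fused estimate.
    Assumption: sensor errors are independent and variances are known.
    """
    if not readings:
        raise ValueError("at least one reading is required")
    weights = [1.0 / variance for _, variance in readings]
    total_weight = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, readings)) / total_weight
    # The variance of the fused estimate is the inverse of the summed weights.
    fused_variance = 1.0 / total_weight
    return fused_value, fused_variance


# Hypothetical temperature readings: (value in degrees Celsius, variance).
readings = [(20.0, 4.0), (22.0, 1.0), (21.0, 2.0)]
fused_value, fused_variance = fuse_measurements(readings)
print(f"fused value: {fused_value:.2f}, fused variance: {fused_variance:.2f}")
```

Under these assumptions the fused variance is smaller than the variance of the best single sensor, which is one simple way to see why the fused result can be more informative than any individual input.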
Many different algorithms can take part in a data fusion process. Some of the best known are KNN (K-Nearest Neighbours), K-Means, collaborative filtering methods, PDA (Probabilistic Data Association), which assigns an association probability to each hypothesis, semantic methods, co-training, manifold alignment, Bayesian networks and many others.
A model that has tried, without success, to standardize the data fusion process is the JDL (Joint Directors of Laboratories) model. However, it remains useful as a model for visualizing the data fusion process and for facilitating discussion and common understanding.
Besides fields such as aerospace and defence, many other fields of science use data fusion.
Some examples of data fusion processes are:
1- Using different sensors to better monitor a certain environment in terms of its physical properties, such as temperature, pressure, humidity, etc.
2- Using different sensors, such as proximity, position, pressure, light, etc., in robotics. A robot typically has to put together all the data coming from its different sensors and process it in order to take decisions.
3- Using data fusion to recognize certain features in videos for automatic classification of the video content.
4- A simpler example is triangulation. It is used by LBS (Location Based Services) to obtain a more precise geo-location when GPS is not available. It may also be used to calculate the coordinates of a ship and its distance from the shore, as in the Wikipedia example.
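The location example can be sketched in a few lines. Note that the sketch below uses distances to known points, which is strictly speaking trilateration, a close relative of triangulation (which works with angles). It is a simplified 2D illustration and not the algorithm of any specific LBS: the function name `trilaterate`, the anchor coordinates and the distances are all hypothetical, and the distances are assumed to be noise-free.

```python
import math


def trilaterate(anchors, distances):
    """Estimate a 2D position from three known points and the distances to them.

    Subtracting the first circle equation from the other two gives a
    2x2 linear system in (x, y), which is solved here with Cramer's rule.
    Assumption: exactly three anchors and noise-free distances.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances

    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2

    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear, the position is not unique")

    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y


# Hypothetical anchors (for example, antennas at known positions) and a
# hypothetical true position used only to generate the distances.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_position = (3.0, 4.0)
distances = [math.hypot(true_position[0] - ax, true_position[1] - ay) for ax, ay in anchors]

x, y = trilaterate(anchors, distances)
print(f"estimated position: ({x:.2f}, {y:.2f})")
```

With real, noisy measurements and more than three anchors, a least-squares fit would normally replace the exact 2x2 solve, but the idea of fusing several partial observations into one position stays the same.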
With this post I want to point to some of the most relevant papers I have read on data fusion.
a) - One that I liked very much is that of Dana Lahat and others: https://hal.archives-ouvertes.fr/hal-01179853/file/Lahat_Adali_Jutten_DataFusion_2015.pdf
It reviews different works but, above all, it presents a way to place the data fusion process in a framework by classifying fusion processes according to the level at which the fusion happens, without avoiding the mathematical topics.
b) - The paper of Yu Zheng on cross-domain data fusion is also worth reading. It clearly distinguishes between traditional data fusion, understood as data matching, and cross-domain data fusion, which uses machine learning techniques to fuse information at a higher level. Methodologies for Cross-Domain Data Fusion: An Overview (http://research.microsoft.com/apps/pubs/?id=252114)
c) - A third paper, which is of minor importance but presents a nice overview of data fusion methods, is "A review of data fusion techniques" by Castanedo: http://www.hindawi.com/journals/tswj/2013/704504/
d) - An example of a data fusion method applied to the analysis of Call Description Records (CDR) data for event detection: Data fusion for city life event detection (http://link.springer.com/article/10.1007/s12652-016-0354-7). There is also a publicly available .ppt presentation of the paper on SlideShare: Data fusion for city live event detection (http://www.slideshare.net/alketcecaj/data-fusion-for-city-live-event-detection-62999960)
e) - Data fusion used for the re-identification of anonymized mobile phone users in CDR data: Re-identification and information fusion between anonymized CDR and social network data (http://link.springer.com/article/10.1007/s12652-015-0303-x?no-access=true). A short .ppt presentation of the paper can be found on SlideShare: Re-identification of Anonymized CDR datasets using Social network Data (http://www.slideshare.net/alketcecaj/reidentification-of-anomized-cdr-datasets-using-social-networlk-data)
Data fusion is a process that keeps evolving over time as more data and techniques (especially machine learning techniques) become available. However, several open challenges remain, such as the lack of data fusion standards and of a "structured data fusion" framework.
Part of the process of data fusion applied to location data is also summarized in my PhD thesis: Information Fusion Methods for Location Data Analysis (http://www.slideshare.net/alketcecaj/information-fusion-methods-for-location-data-analysis)
This post comes from my "Algorithms and data fusion" blog on Quora. You can find more here: https://algo-data.quora.com/Data-fusion-an-overview-of-some-relevant-works?srid=n9bS