Integrated data analysis across multiple institutions

Background

In recent years, as collecting and storing data has become easier, companies and institutions have been accumulating their own data and analyzing it with artificial intelligence (AI). AI analysis performs well only when a sufficient amount of data is available, so integrating the data held by multiple institutions is expected to enable more accurate analysis. However, original data that contains personal information or trade secrets is difficult to share across institutions. There is therefore a need for technology that enables integrated analysis across corporate and institutional boundaries without sharing the original data.

Outline and research topics

  • Development of data collaboration (DC) analysis for integrated analysis of data held by multiple institutions without sharing the original data.
    • Data collaboration (DC) analysis is a recent technology that enables integrated analysis of data from multiple institutions by sharing only “intermediate representations” converted from the original data using AI technology, instead of the original data, which contains private information. Because DC analysis can handle a large amount of data while keeping the highly confidential information in the original data secure, it is expected to improve the performance of AI analysis significantly.
    • Examples of research topics:
      • Development of DC technology for various types of data and tasks.
      • Advancement of DC technology.
  • Application of DC technology to real data from companies, municipalities, hospitals, etc.
    • We are conducting demonstration tests of DC analysis on real data from the companies, municipalities, hospitals, and other organizations with which we conduct joint research.
    • Examples of research topics:
      • Performance evaluation of DC analysis on real-world data.
      • Development of data analysis and preprocessing techniques for actual data.
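
The idea of sharing only intermediate representations, described under the first topic above, can be illustrated with a small sketch. The text does not specify how the representations are built or combined, so everything below is an assumption made for illustration: each institution applies its own private linear map (`F1`, `F2`) to its data and to a shareable anchor dataset (`anchor`), and the analyst aligns the institutions using only the mapped anchor data. The function names `make_intermediate` and `align` are hypothetical.

```python
import numpy as np

def make_intermediate(X, anchor, F):
    """Run locally at one institution (assumed setup).

    Only the two returned arrays leave the site; X and F stay private.
    """
    return X @ F, anchor @ F

def align(anchor_reps, k):
    """Run by the analyst, who sees only intermediate representations.

    Builds a common target Z from the stacked anchor representations and
    solves a least-squares problem per institution for a map G_i such
    that (anchor @ F_i) @ G_i is close to Z.
    """
    stacked = np.hstack(anchor_reps)
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    Z = U[:, :k]  # top-k left singular vectors as the shared target
    return [np.linalg.lstsq(A_i, Z, rcond=None)[0] for A_i in anchor_reps]

rng = np.random.default_rng(0)
m, k, r = 8, 4, 50                      # features, reduced dimension, anchor rows
anchor = rng.normal(size=(r, m))        # shareable dummy data, not real records
X1 = rng.normal(size=(30, m))           # private data of institution 1
X2 = rng.normal(size=(20, m))           # private data of institution 2
F1 = rng.normal(size=(m, k))            # private map of institution 1
F2 = rng.normal(size=(m, k))            # private map of institution 2

rep1, anc1 = make_intermediate(X1, anchor, F1)
rep2, anc2 = make_intermediate(X2, anchor, F2)

G1, G2 = align([anc1, anc2], k)
combined = np.vstack([rep1 @ G1, rep2 @ G2])  # one dataset for joint analysis
print(combined.shape)
```

In this sketch the analyst never receives `X1`, `X2`, `F1`, or `F2`. Any standard model could then be trained on `combined`. A random linear map is used only to keep the example short and is not by itself a privacy guarantee.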