These frequently asked questions about the fair data economy and the ihan.fi website were collected during the project, including during the “Sneak peek” sessions in June and October 2020.
If you have a question you cannot find the answer to, please contact us at ihan@sitra.fi. We are happy to help!
Fair data economy and IHAN project
1. What are the data economy and the fair data economy? What is the IHAN project?
The data economy is a universe of initiatives, activities and/or projects whose business model is based on the exploration and exploitation of the structures of databases to identify opportunities for generating products and services. The fair data economy is a part of the economy that focuses on creating services and data-based products in an ethical manner. Fairness means that the rights of individuals are protected and the needs of all stakeholders are taken into account, and this is what Sitra’s IHAN project is driving.
2. What are the differences and similarities between the IHAN project and the Gaia-X initiative?
Gaia-X is a project aimed at creating a European data infrastructure; a federated secure solution to support digital sovereignty and innovation. IHAN as a project enables the development of services based on data sovereignty, sharing the objectives of Gaia-X.
Ihan.fi and the testbed
3. What is ihan.fi?
The ihan.fi site demonstrates how services can be built with the IHAN fair data economy infrastructure. The testbed, tools and demo applications for creating fair data economy services will be compiled under the ihan.fi platform, which will support service development on a one-stop-shop basis.
The site is being built in phases throughout 2020. The alpha version was released in June. During the summer, selected companies will start experimenting on the testbed. The testbed will later be released for the whole internet community.
The long-term goal of ihan.fi is to support new internet standards for data productising, portability and interoperability, and to boost the emergence of global data markets.
4. Why was ihan.fi built?
Sitra’s task is to build a successful Finland for tomorrow. The purpose of the fund is to promote the stable and balanced development of Finland, quantitative and qualitative economic growth and international competitiveness.
The better use of data and the emergence of data markets will enhance growth. Sitra has a clear role here as a neutral mediator, as regulation proceeds at its own pace and companies are not necessarily capable of investing in this type of testbed. As Sitra is a public body, all of our outputs are by default open and available to everyone.
5. What is the target group of the ihan.fi site?
The site is for all organisations and individuals interested in the data economy.
6. What can I do on the site?
In the alpha phase, you can check out, for example, the rulebook template and the pre-standards for data sharing, as well as a well-being app demo built according to the IHAN requirement specifications, which offers a great example of a fair data economy service. In spring 2021, developers will be able to log into the development environment to test building real fair data economy services with previously tested components and to provide their own data sets for others to use.
7. Can I get involved now?
You can join the community by subscribing to our newsletter. If you have a ready-to-go business pilot focusing on the exchange and reuse of data, you can contact us at ihan@sitra.fi for details. By default, we offer the testbed and technical support free of charge. We will not be funding any new pilots at this time.
8. How can I receive information about the opening of the site to everyone later on?
For the final and full version, we will arrange a launch event in spring 2021. Check out the current release of ihan.fi and stay tuned for the next developments. More info on the project page.
9. Is the testbed solution purely for Finland? If not, how are you planning to drive a unified data language for global harmonisation? How will you deal with different character sets for true internationalisation?
As the standards, architectures and concepts are built for the future Web, the testbed is also built for the global community. We welcome any nationality, company, individual and organisation to use the testbed as long as they adhere to the testbed rulebook defined by the community. Global standardisation relies on separating the linguistic semantics from the real-world context. The data product is a meta wrapper that describes the context of an IP payload in terms that both humans and machines can understand. One of those standard attributes is the language in which the product is made available. This makes it possible, for example, to create an energy certificate for a building in any language.
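As a rough illustration of the wrapper idea above, the following Python sketch builds a JSON-LD-style description around a payload and records the language as one of its attributes. Only the `@context`, `@vocab`, `@type` and `@language` keywords come from JSON-LD; the vocabulary URL, the field names and the `wrap_data_product` helper are hypothetical and are not taken from the IHAN specifications.

```python
import json

# Hypothetical vocabulary URL, used only for illustration.
EXAMPLE_VOCAB = "https://example.org/data-product/v1#"


def wrap_data_product(product_type, language, payload):
    """Wrap a payload in a JSON-LD-style envelope stating its context and language."""
    return {
        "@context": {"@vocab": EXAMPLE_VOCAB, "@language": language},
        "@type": product_type,
        "payload": payload,
    }


# The same real-world fact (energy class B) offered in two languages.
certificate_en = wrap_data_product(
    "EnergyCertificate", "en", {"energyClass": "B", "title": "Energy certificate"}
)
certificate_fi = wrap_data_product(
    "EnergyCertificate", "fi", {"energyClass": "B", "title": "Energiatodistus"}
)

print(json.dumps(certificate_fi, ensure_ascii=False, indent=2))
```

In this sketch the type and the energy class stay identical across languages, while only the human-readable title and the declared language change.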
Data product standards are meant to be linked directly with the real-world industrial and international standards used in commerce, trade, finance, banking, manufacturing and the built environment. By defining digital identities, e.g. through the Harmonized System, we can enable any industry or company to model data products for any manufactured or traded commodity globally. The same goes for financial assets modelled by ISIN-coded identities.
The scaling work has been kickstarted through open collaboration, agile experimentation and a strong community, with the backing of Sitra’s role as a public utility with strong European connections. We are building on proven and scalable technologies and concepts.
10. How is the privacy of the test user companies protected?
In this phase, the testbed is a closed environment. Security is one of the key aspects of the architecture design, and audits are being planned as part of the development phases. In 2021, when the testbed is opened to the public, a common rulebook will define the rules for operating in the ecosystem.
11. How do we ensure that standardisation of data productising proceeds at EU level?
An open and motivated community and real cases built by the leading industry players and governments are the keys to success.
12. What is the approach to enabling semantic interoperability in the testbed?
The approach is to link digital identities and data products with global standards. Semantic interoperability is created by offering global core standards for both data products and digital identities. On the testbed, the OpenAPI specification and a JSON-LD-based vocabulary and ontology are available on GitHub as a core ontology that you can use to build your own industry-specific and company-specific ontologies. The testbed also offers the product gateway and the global identity graph for semantic discovery and full interoperability between software and data sources. The data source is linked to the standards only once, through the data product onboarding process. After that, the data becomes discoverable without any additional harmonisation effort from the consumers of the data. One identity, all the data – any application.
13. Are all the datasets in the system based on JSON-LD effort?
At this point in the standardisation work, the data products are based on the OpenAPI specification 3.0 and JSON-LD, which together enable structured data exchange with contextual linking. So far, these are seen as the most prominent candidates for standardisation, as the OpenAPI specification has a strong footing in the market, as does JSON-LD in the W3C. That said, after a few years of practical use, we have found that further development is also needed.
14. Is data sharing a one-time deal or is there an expiry date with proper permission management?
Several levels of access control are being planned for the IHAN testbed. Access to data products can be based either on a person explicitly giving and revoking permission or on automated mechanisms. Several different use cases that need to be supported have been identified and will be investigated.
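To make the difference between explicit revocation and an automated mechanism such as an expiry date more concrete, here is a minimal Python sketch. The `Consent` class, its fields and the identifiers are hypothetical and do not describe the planned IHAN testbed implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Consent:
    """A hypothetical consent record; not an IHAN data structure."""

    subject: str
    data_product: str
    granted_at: datetime
    expires_at: Optional[datetime] = None  # None means no automatic expiry
    revoked: bool = False

    def revoke(self) -> None:
        """Explicit revocation by the person who gave the consent."""
        self.revoked = True

    def is_valid(self, now: datetime) -> bool:
        """Valid only if granted, not revoked and not past the expiry date."""
        if self.revoked:
            return False
        if self.expires_at is not None and now >= self.expires_at:
            return False
        return now >= self.granted_at


start = datetime(2021, 1, 1, tzinfo=timezone.utc)
consent = Consent(
    "user-123", "wellbeing/steps", start, expires_at=start + timedelta(days=30)
)

valid_day_10 = consent.is_valid(start + timedelta(days=10))  # inside the 30-day window
valid_day_31 = consent.is_valid(start + timedelta(days=31))  # after automatic expiry
consent.revoke()
valid_after_revoke = consent.is_valid(start + timedelta(days=10))
```

Under these assumptions, a one-time deal would simply be a consent with a very short expiry, while an ongoing permission would have no expiry and rely on explicit revocation.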
15. My company is building an ecosystem for different roles and partners in which we will give data control completely back to the users. Can we use the data standards and protocols in the IHAN testbed as a service to integrate into our ecosystem? As I understand, we could then work as a data source via IHAN for third parties. Is that correct?
The whole point of the standards is that you can use them any way you wish to create your own ecosystem. If you want to use the testbed, the only restriction is that you accept the terms and conditions based on the upcoming testbed rulebook. You can then offer data to anybody in the IHAN ecosystem or build solutions using any data offered as you wish.
16. Is there interest in being involved as an interoperability test area in the new Green Deal efforts?
The IHAN testbed is open to all projects and pilots creating new services based on data sovereignty.
17. In a fair data economy, would you avoid winner-takes-all companies, and how? They could start with public data, build a good user experience and continue with a closed system?
From our perspective, we should not try to avoid them. If, in the future, public data maintained with public money could be released more easily from silos for use in real-time economy solutions that are truly beneficial to companies, we believe that could be a desired end result.
18. Does the testbed include any MyData Operators?
The IHAN testbed does not include or require MyData Operators or any other intermediaries; however, if a pilot project wishes to create or test new services based on those, it is welcome to do so.
19. When organisation A holds data about me and I want to give organisation B permission to access that data held by A, is there currently, or will there be in future, any way to facilitate this? How would this be communicated and authenticated?
The consent mechanisms planned for the IHAN testbed architecture will control this. The most suitable solutions and the provider for this feature are currently being assessed through the proof-of-concept cases, and the feature is on the IHAN testbed roadmap. When an application makes a request to a product gateway, the gateway forwards the request to the productiser, which can make additional checks at the user or application level to ascertain whether the required permissions are in place before the data is released from the data provider.
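The request flow described above can be sketched roughly as follows. This is a simplified, in-memory illustration only: the class names, method names and status values are hypothetical, and a real product gateway and productiser would communicate over the network and consult an actual consent service.

```python
class Productiser:
    """Hypothetical productiser: checks permissions, then reads from the data provider."""

    def __init__(self, provider_data, permissions):
        self._provider_data = provider_data  # stands in for the data provider
        self._permissions = permissions      # set of (user, application, product)

    def handle(self, user, application, product):
        # Release data only if a matching permission is in place.
        if (user, application, product) not in self._permissions:
            return {"status": "denied", "data": None}
        return {"status": "ok", "data": self._provider_data.get(product)}


class ProductGateway:
    """Hypothetical gateway: forwards each request to the productiser for that product."""

    def __init__(self):
        self._routes = {}

    def register(self, product, productiser):
        self._routes[product] = productiser

    def request(self, user, application, product):
        productiser = self._routes.get(product)
        if productiser is None:
            return {"status": "unknown_product", "data": None}
        return productiser.handle(user, application, product)


productiser = Productiser(
    provider_data={"wellbeing/steps": {"steps": 8200}},
    permissions={("user-123", "demo-app", "wellbeing/steps")},
)
gateway = ProductGateway()
gateway.register("wellbeing/steps", productiser)

allowed = gateway.request("user-123", "demo-app", "wellbeing/steps")
denied = gateway.request("user-123", "other-app", "wellbeing/steps")
```

In this sketch the gateway itself holds no data and makes no permission decisions; it only routes, which mirrors the division of responsibilities described in the answer.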
20. How about the pricing of the standardised data products? Is it included or can it be included?
The IHAN testbed is for trialling and testing and therefore does not enable any kind of commercial activity. We will include test payment capabilities on the testbed as a way of seeing how it might work out in production, but no real money will ever be transferred on the testbed.
SisuID
21. For interoperable authentication, do you use OpenID Connect or something else that is already a standard?
SisuID supports both OpenID Connect (OIDC) and SAML 2.0 integration standards, but OIDC is the preferred integration method for relying services.
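For readers unfamiliar with OIDC, the sketch below shows how a relying service could build a standard authorization code flow request. The request parameters (`response_type`, `scope`, `client_id`, `redirect_uri`, `state`, `nonce`) come from the OpenID Connect Core specification; the endpoint URL, client ID and redirect URI are placeholders and are not actual SisuID values.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorization_url(authorization_endpoint, client_id, redirect_uri):
    """Build an OIDC authorization code flow request URL.

    Returns the URL together with the generated state and nonce, which the
    relying service must store and verify when the user is redirected back.
    """
    state = secrets.token_urlsafe(16)  # protects the redirect against CSRF
    nonce = secrets.token_urlsafe(16)  # binds the ID token to this request
    params = {
        "response_type": "code",
        "scope": "openid",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
        "nonce": nonce,
    }
    return f"{authorization_endpoint}?{urlencode(params)}", state, nonce


# Placeholder values; a real integration would use the provider's published endpoint.
url, state, nonce = build_authorization_url(
    "https://id.example.org/authorize",
    client_id="my-relying-service",
    redirect_uri="https://service.example.org/callback",
)
query = parse_qs(urlparse(url).query)
```

In practice, the endpoint would normally be read from the identity provider's discovery document rather than hard-coded.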
22. Which database technology and architecture are you using?
SisuID uses relational database management system (RDBMS) technology. Any SQL database, such as MySQL, Oracle or MS SQL Server, can be used. The current version of SisuID uses a MySQL database running on AWS.
23. Is SisuID using self-sovereign identity technology?
Not yet, but in the previous phase of the project we carried out a proof-of-concept pilot in which we implemented decentralised identity capabilities in the SisuID ecosystem on top of the Findy identity ledger.
24. Is it possible to add more companies to the SisuID other than the two mentioned?
SisuID is the trusted identity solution for individuals. Anybody can currently create a trusted identity by verifying it either with bank credentials or a passport. People with a trusted identity can then be linked to the organisation(s) that use SisuID as an authentication method in their data-sharing solution(s). So, SisuID can be scaled for the use of many companies.
Well-being application demo (11 June 2020)
25. In the well-being app, who is running the AI that Aino is using? Is this provided by the data operator as a service? Who is “listening” to the conversation with AI Bot?
The AI is just software on the Web, i.e. it is selected and run by the application provider. Any suitable AI/NLP solution can be used, as long as the application uses the data product standards that enable the easy use of any compatible online data.
As the application developer, you decide who “listens in” to the data traffic by selecting the most suitable AI solutions, cloud services and data sources for your applications – it’s just the internet. With the help of the trust architecture, you can of course have better authentication and consent mechanisms. We also plan to implement the data product exchange protocol with a fully secure data exchange layer, preventing any “man-in-the-middle” attacks and preventing any party in the network from being privy to the data product contents that are being transferred.
In the current architecture there is no “data operator” role, only data providers and application developers. In reference to “MyData operators” they are just that – providers of the GDPR-compliant consent for data use and the decentralised trust oracles the productisers use to validate the existence of consent before the data products are released.
26. In the well-being app – who is paying for the data and when?
The point of the demo was not first and foremost to demonstrate a business model but rather the real value created for an individual with real data that “never meets”. For this demo, the most probable customer would be an app user, with the price of the data products included, for example, in a monthly app subscription that yields a profit for the app developer. It is up to you to decide who should pay and when. The options are no longer limited primarily to selling users’ data for marketing; you can instead create real-world services with it.
27. In the well-being app, what is the governance behind the data exchange model?
The governance model is always selected by the data sharing communities and ecosystems themselves. The rulebook offers a great starting point for deciding on your own governance model. The data product exchange standards should allow for automated policy control between different ecosystems and rulebooks to also enable global interoperability between the governance models.
Vero (Finnish Tax Administration) and PRH (Finnish Patent and Registration Office) proof of concept (PoC): Fast Track to Finland (20 October 2020)
28. Why did Vero choose this process for a proof of concept?
We wanted to prove that an old and outdated process could be replaced with a modern digitalised system where data could be shared in real time between different organisations, which would also enable a high level of automation. The current process is slow and manual, a product of an era based on paper. In addition, it is not user-friendly, nor does it allow any cost savings through automation for the organisations providing services. A digitalised, fast and user-friendly process works better for attracting new companies and investments (taxpayers) to Finland. So, in this sense, we also are trying to ensure our future tax revenue with this PoC.
29. Did Vero start by digitalising the existing process or did you completely redesign the process itself using modern insights?
Digitalising the Finnish tax administration has been an enormous process over the last two decades, featuring numerous different types of development projects. The furthest progress was made by a greenfield project in which over 70 different taxation systems were replaced by one complete taxation software package (GenTax). That allowed development of various e-services for customers, such as pre-filled online tax returns, online tax cards and MyTax.