Date posted: 11-May-2015
Category: Technology
Uploaded by: soeren-auer
The Semantic Data Web for Enterprises
Vision, Technology, Applications
Sören Auer
AKSW Research Group
Why Semantic Web?
Problem: Try to search for these things on the current Web:
• Apartments near German-Russian bilingual childcare in Leipzig.
• ERP service providers with offices in Vienna and London.
• Researchers working on multimedia topics in Eastern Europe.
Information is available on the Web, but opaque to current Web search.
[Diagram: leipzig.de (has everything about childcare in Leipzig) and Immobilienscout.de (knows all about real estate offers in Germany) each consist of a DB behind a Web server; the search engine sees their HTML, and RDF is exposed alongside it.]
Solution: complement text on Web pages with structured linked open data, and intelligently combine and integrate such structured information from different sources.
From the Web of Documents to the Semantic Data Web
Web (since 1992)
• HTTP
• HTML/CSS/JavaScript
Semantic Web (Vision 1998, starting ???)
• Reasoning
• Logic, Rules
• Trust
Social Web (since 2003)
• Folksonomies/Tagging
• Reputation, sharing
• Groups, relationships
Data Web (since 2006)
• URI dereferenceability
• Web data integration
• RDF serializations
The Long Tail of Information Domains
[Figure: long-tail curve with popularity on the vertical axis. Head (currently supported structured content types): pictures, news, video, recipes, calendar. Tail (not or insufficiently supported content types, the target of SemWeb-supported structured content): gene sequences, itinerary of King George, talent management, requirements engineering, special interest communities, …]
The Long Tail by Chris Anderson (Wired, Oct. '04), adapted to information domains
The Vision: a Web of Linked Data
[Figure: yearly snapshots, 2007 to 2009.]
Tools shown: Virtuoso, SemMF, SILK, poolparty, DL-Learner, Sindice, Sigma, ORE, OntoWiki, MonetDB, DXX Engine, WiQA
Activities shown: repair, interlink, fuse, classify, enrich, create
Semantic Web - Standards
Standardization Semantic Web
1994 • First public presentation of the Semantic Web idea
1998 • Start of standardization of a data model (RDF) and a first ontology language (RDFS) at W3C
2000 • Start of large research projects about ontologies in the US and Europe (DAML & Ontoknowledge)
2002 • Start of standardization of a new ontology language (OWL) based on research results
2004 • Finalization of the standards for data (RDF) and ontology (OWL)
2006 • Standardization of a query language (SPARQL, 6 April 2006)
• Ongoing work on rule languages (SWRL, DL-safe rules, RIF)
• Extension of OWL to OWL 1.1 / 2.0
• Ontology language of OMG based on UML (ODM)
Semantic Web Architecture
[Figure: architecture diagram distinguishing parts that are now standardized from those under current research.]
2008 • RDFa
2009 • OWL2
Data Access and Integration at the Semantic Level
Data Models
• RDBMS: organize data in relations, rows, cells (Oracle, DB2, MS-SQL)
• Entity-attribute-value (EAV): HELP medical record system, TrialDB
• Column-oriented DBMS: collocates column values rather than row values (Vertica, C-Store, MonetDB)
• Triple/Quad Stores: RDF data model (Virtuoso, Oracle, Sesame)
• Others: XML, hierarchical, tree, graph-oriented DBMS
Data Access
• Procedural APIs: ODBC, JDBC
• Object-relational mappings (ORM): NeXT's EOF / WebObjects, ADO.NET Entity Framework, Hibernate
• Query Languages: Datalog, SQL, SPARQL, XPATH/XQuery
• Linked Data: dereferenceable URIs, RDF serialization formats
Data Integration
• Enterprise Information Integration: sets of heterogeneous data sources appear as a single, homogeneous data source
• Data Warehousing: based on extract, transform, load (ETL); Global-As-View (GAV)
• Research: mediators, ontology-based, P2P, Web service-based
• Data Web: URIs as entity identifiers; HTTP as data access protocol; Local-As-View (LAV)
Linked Data Web Technology
1. Uses RDF as its data model
[Graph: AKSW --organizes--> LSWT2010; LSWT2010 --takesPlaceAt--> 6.5.2010; LSWT2010 --takesPlaceIn--> Leipzig]
2. Is serialized as triples:
AKSW organizes LSWT2010
LSWT2010 takesPlaceAt "2010-05-06"^^xsd:date
LSWT2010 takesPlaceIn Leipzig
3. Uses content negotiation
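The three points on this slide can be sketched in a few lines of Python. This is a minimal illustration, not a real Linked Data server: the `http://example.org/` namespace, the `negotiate` helper, and the choice of N-Triples and HTML as the two offered representations are assumptions made for the example. The location triple uses `takesPlaceIn`, following the edge labels of the graph, and the date is written in the `xsd:date` lexical form `2010-05-06`.

```python
import html
from typing import NamedTuple, Union

BASE = "http://example.org/"  # hypothetical namespace, not from the slides
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


class Literal(NamedTuple):
    value: str
    datatype: str


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: Union[str, Literal]


# The three statements from the slide.
TRIPLES = [
    Triple("AKSW", "organizes", "LSWT2010"),
    Triple("LSWT2010", "takesPlaceAt", Literal("2010-05-06", XSD_DATE)),
    Triple("LSWT2010", "takesPlaceIn", "Leipzig"),
]


def term(node):
    """Render a resource name or a typed literal in N-Triples syntax."""
    if isinstance(node, Literal):
        return f'"{node.value}"^^<{node.datatype}>'
    return f"<{BASE}{node}>"


def to_ntriples(triples):
    """One 'subject predicate object .' line per triple."""
    return "\n".join(
        f"{term(t.subject)} {term(t.predicate)} {term(t.obj)} ." for t in triples
    )


def to_html(triples):
    """Human-readable rendering of the same statements."""
    items = []
    for t in triples:
        obj = t.obj.value if isinstance(t.obj, Literal) else t.obj
        items.append(
            f"<li>{html.escape(t.subject)} {html.escape(t.predicate)} "
            f"{html.escape(obj)}</li>"
        )
    return "<ul>" + "".join(items) + "</ul>"


def negotiate(accept_header):
    """Pick a representation from an HTTP Accept header (q-values are ignored)."""
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip()
        if media_type == "application/n-triples":
            return media_type, to_ntriples(TRIPLES)
        if media_type in ("text/html", "*/*"):
            return "text/html", to_html(TRIPLES)
    return "text/html", to_html(TRIPLES)  # default for unknown media types


if __name__ == "__main__":
    print(negotiate("application/n-triples")[1])
```

The same resource thus answers a browser with HTML and a data client with RDF. A production server would also honor q-values and offer further serializations such as Turtle or RDF/XML.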
RDF Vocabularies: Class and Property Hierarchies
Beer rdf:type rdfs:Class
BottomFermentedBeer rdfs:subClassOf Beer
Bock rdfs:subClassOf BottomFermentedBeer
Lager rdfs:subClassOf BottomFermentedBeer
Pilsner rdfs:subClassOf BottomFermentedBeer
hasContent rdf:type rdf:Property
hasAlcoholicContent rdfs:subPropertyOf hasContent
hasOriginalWortContent rdfs:subPropertyOf hasContent
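Because rdfs:subClassOf and rdfs:subPropertyOf are transitive, the hierarchy above implies more than it states, for example that every Pilsner is a Beer. The sketch below walks the hierarchy upward. It assumes each class or property has at most one direct parent, which holds for this slide but not for RDFS in general, and the `ancestors` helper is a name invented for the example.

```python
# Direct parent of each class and property, copied from the slide.
SUBCLASS_OF = {
    "BottomFermentedBeer": "Beer",
    "Bock": "BottomFermentedBeer",
    "Lager": "BottomFermentedBeer",
    "Pilsner": "BottomFermentedBeer",
}
SUBPROPERTY_OF = {
    "hasAlcoholicContent": "hasContent",
    "hasOriginalWortContent": "hasContent",
}


def ancestors(name, parent_of):
    """Follow parent links upward; returns the nearest ancestor first."""
    chain = []
    seen = {name}
    while name in parent_of:
        name = parent_of[name]
        if name in seen:  # guard against accidental cycles
            break
        seen.add(name)
        chain.append(name)
    return chain


if __name__ == "__main__":
    print(ancestors("Pilsner", SUBCLASS_OF))
    print(ancestors("hasAlcoholicContent", SUBPROPERTY_OF))
```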
RDFS Instances
• Instances are assigned to one or more classes:
Boddingtons rdf:type Ale
Grafentrunk rdf:type Bock
Hoegaarden rdf:type White
Jever rdf:type Pilsner
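Combined with a class hierarchy, these type statements let a reasoner derive additional types: if Jever is a Pilsner and Pilsner is a subclass of BottomFermentedBeer, then Jever is also a BottomFermentedBeer. The following sketch of that inference assumes the subclass statements from the class-hierarchy slide. Ale and White have no superclass there, so their instances keep only their asserted type. The function name `inferred_types` is invented for the example.

```python
# Subclass statements from the class-hierarchy slide: (subclass, superclass).
SUBCLASS_OF = [
    ("BottomFermentedBeer", "Beer"),
    ("Bock", "BottomFermentedBeer"),
    ("Lager", "BottomFermentedBeer"),
    ("Pilsner", "BottomFermentedBeer"),
]

# Asserted rdf:type statements from this slide: (instance, class).
TYPES = [
    ("Boddingtons", "Ale"),
    ("Grafentrunk", "Bock"),
    ("Hoegaarden", "White"),
    ("Jever", "Pilsner"),
]


def inferred_types(instance):
    """Asserted classes of an instance plus all their superclasses."""
    result = {cls for inst, cls in TYPES if inst == instance}
    frontier = list(result)
    while frontier:
        current = frontier.pop()
        for sub, sup in SUBCLASS_OF:
            if sub == current and sup not in result:
                result.add(sup)
                frontier.append(sup)
    return result


if __name__ == "__main__":
    for name, _ in TYPES:
        print(name, sorted(inferred_types(name)))
```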
Vocabularies: Friend-of-a-Friend (FOAF)
• defines classes and properties for representing information about people and their relationships
Soeren rdf:type foaf:Person
Soeren foaf:currentProject http://OntoWiki.net
Soeren foaf:homepage http://aksw.org/Soeren
Soeren foaf:knows http://sembase.at/Tassilo
Soeren foaf:sha1 09ac456515dee
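Names such as foaf:Person are shorthand: the prefix stands for a namespace URI, and the full property or class URI is the concatenation of namespace and local name. A minimal sketch of that expansion follows. The `expand` helper is a name invented for the example, while the two namespace URIs are the published FOAF and RDF namespaces.

```python
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(term):
    """Expand a prefixed name to a full URI; leave anything else unchanged."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return term


if __name__ == "__main__":
    for name in ("rdf:type", "foaf:Person", "foaf:knows", "http://aksw.org/Soeren"):
        print(name, "->", expand(name))
```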
Integrating RDF and HTML: RDFa
<div typeof="foaf:Person" xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <p property="foaf:name"> Alice Birpemswick </p>
  <p> Email: <a rel="foaf:mbox" href="mailto:[email protected]">[email protected]</a> </p>
  <p> Phone: <a rel="foaf:phone" href="tel:+1-617-555-7332">+1 617.555.7332</a> </p>
</div>
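To show how such markup yields triples, here is a deliberately simplified extractor built on Python's standard `html.parser`. It is not a conformant RDFa processor: it assumes a single subject (the blank node `_:b0`), reads only the `typeof`, `property`, and `rel`/`href` attributes, and expects `property` elements to contain plain text without nested tags. The address `alice@example.org` is a placeholder chosen for the example.

```python
from html.parser import HTMLParser

SNIPPET = """
<div typeof="foaf:Person" xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <p property="foaf:name"> Alice Birpemswick </p>
  <p> Email: <a rel="foaf:mbox" href="mailto:alice@example.org">alice@example.org</a> </p>
  <p> Phone: <a rel="foaf:phone" href="tel:+1-617-555-7332">+1 617.555.7332</a> </p>
</div>
"""


class TinyRDFaParser(HTMLParser):
    """Collects (subject, predicate, object) tuples from a small RDFa subset."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._subject = "_:b0"  # assumption: one blank-node subject per snippet
        self._pending = None    # property whose text content is being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "typeof" in attributes:
            self.triples.append((self._subject, "rdf:type", attributes["typeof"]))
        if "property" in attributes:
            self._pending = attributes["property"]
            self._text = []
        if "rel" in attributes and "href" in attributes:
            self.triples.append((self._subject, attributes["rel"], attributes["href"]))

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first end tag closes the pending property.
        if self._pending is not None:
            literal = "".join(self._text).strip()
            self.triples.append((self._subject, self._pending, literal))
            self._pending = None


def extract(markup):
    parser = TinyRDFaParser()
    parser.feed(markup)
    return parser.triples


if __name__ == "__main__":
    for triple in extract(SNIPPET):
        print(triple)
```

The page stays readable for humans while the same elements carry machine-readable statements. A real application would use a full RDFa parser that also handles `about`, `resource`, `content`, and nested subjects.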
Application and Deployment Potential in the Enterprise
1. Integration of heterogeneous information sources by means of ontologies and background knowledge (e.g. DBpedia)
2. Semantic wikis (e.g. OntoWiki) help create and manage structured knowledge bases
Transforming Wikipedia into a Knowledge Base
• Community effort to extract structured information from Wikipedia and to make this information available on the Web
• Allows sophisticated queries against Wikipedia (e.g. universities in Brandenburg, mayors of elevated towns, soccer players) and linking of other data sets on the Web to Wikipedia data
• Represents a community consensus
• The recently launched DBpedia Live transforms Wikipedia into a structured knowledge base
S. Auer, C. Bizer, J. Lehmann, G. Kobilarov, R. Cyganiak, Z. Ives: DBpedia: A Nucleus for a Web of Open Data. 6th International Semantic Web Conference, ISWC 2007.
S. Auer, J. Lehmann: What have Innsbruck and Leipzig in common? Extracting Semantics from Wiki Content. 4th European Semantic Web Conference, ESWC 2007.
Structure in Wikipedia
• Title
• Abstract
• Infoboxes
• Geo-coordinates
• Categories
• Images
• Links
– Other language versions
– Other Wikipedia pages
– To the Web
– Redirects
– Disambiguations
Infobox templates (wikitext syntax)
{{Infobox Korean settlement
| title = Busan Metropolitan City
| img = Busan.jpg
| imgcaption = A view of the [[Geumjeong]] district in Busan
| hangul = 부산 광역시
...
| area_km2 = 763.46
| pop = 3635389
| popyear = 2006
| mayor = Hur Nam-sik
| divs = 15 wards (Gu), 1 county (Gun)
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}
RDF representation
http://dbpedia.org/resource/Busan
dbp:Busan dbpp:title "Busan Metropolitan City"
dbp:Busan dbpp:hangul "부산 광역시"@Hang
dbp:Busan dbpp:area_km2 "763.46"^^xsd:float
dbp:Busan dbpp:pop "3635389"^^xsd:int
dbp:Busan dbpp:region dbp:Yeongnam
dbp:Busan dbpp:dialect dbp:Gyeongsang
...
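The mapping from infobox to RDF can be approximated with a few rules: every `| key = value` line becomes one triple, a value that consists of a single wiki link becomes a resource, and a numeric value becomes a typed literal. The sketch below implements only these rules. It is an illustration of the idea, not the DBpedia extraction framework; the helper names are invented for the example, and cases such as links embedded in running text or language tags are ignored.

```python
import re

WIKITEXT = """{{Infobox Korean settlement
| title = Busan Metropolitan City
| area_km2 = 763.46
| pop = 3635389
| mayor = Hur Nam-sik
| region = [[Yeongnam]]
| dialect = [[Gyeongsang]]
}}"""

FIELD = re.compile(r"^\|\s*(\w+)\s*=\s*(.*?)\s*$")
LINK = re.compile(r"^\[\[([^\]|]+)\]\]$")


def to_object(value):
    """Map an infobox value to a resource, a typed literal, or a plain literal."""
    link = LINK.match(value)
    if link:
        return "dbp:" + link.group(1).replace(" ", "_")
    if re.fullmatch(r"-?\d+", value):
        return f'"{value}"^^xsd:int'
    if re.fullmatch(r"-?\d+\.\d+", value):
        return f'"{value}"^^xsd:float'
    return f'"{value}"'


def infobox_to_triples(subject, wikitext):
    """One (subject, dbpp:key, object) triple per '| key = value' line."""
    triples = []
    for line in wikitext.splitlines():
        match = FIELD.match(line)
        if match:
            key, value = match.groups()
            triples.append((subject, "dbpp:" + key, to_object(value)))
    return triples


if __name__ == "__main__":
    for s, p, o in infobox_to_triples("dbp:Busan", WIKITEXT):
        print(s, p, o)
```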
A Large Multilingual, Multi-Domain Knowledge Base
The DBpedia extraction yields:
• Descriptions of about 3.4 million things (1.5 million classified in a consistent ontology, including 312,000 persons, 413,000 places, 94,000 music albums, 49,000 films, 15,000 video games, 140,000 organizations, 146,000 species, 4,600 diseases)
• Labels and abstracts in 92 different languages; 1,460,000 links to images and 5,543,000 links to external web pages; 4,887,000 external links into other RDF datasets, 565,000 Wikipedia categories, and 75,000 YAGO categories
• Altogether more than 1 billion facts (i.e. RDF triples): 257M from the English edition, 766M from other language editions
DBpedia Leaves Visible Traces in Science, Technology, and Society
• DBpedia became the central interlinking hub on the Data Web
• Scientific publications attracted more than 500 citations
• More than 15,000 monthly visits on DBpedia.org, numerous press articles, blog posts …
• Ecosystem of commercial and community applications: ThomsonReuters, BBC, Neofonie, Openlink, Faviki …
The Semantic Data Wiki
• Agile, distributed knowledge engineering
• Not a wiki with a semantic extension (Semantic MediaWiki, IkeWiki), but an ontology editor that uses wiki concepts:
– Make it easy to correct mistakes (ant intelligence)
– Activity can be watched and reviewed
– Everything can be undone
Introducing AKSW
OntoWiki: Catalogus Professorum
SoftWiki
Problem: requirements engineering with large, geographically distributed stakeholder groups
Solution: comprehensive ontology for RE knowledge plus an adapted OntoWiki application
Application of text-mining algorithms for duplicate detection
OntoWiki: Vakantieland
Take-Home Messages
• Semantic Web
– Supports the integration of data on the Web (uniform triple data model)
– Standardized (W3C) Linked Data technology base
• Ontologies and background knowledge (e.g. DBpedia) help integrate heterogeneous information sources
• Semantic wikis help create and manage RDF knowledge bases
Thank you!
Sören Auer
[email protected]
Agile Knowledge Engineering & Semantic Web (AKSW)
http://aksw.org
Mediencampus “Villa Ida”
Part-time master's program "Content- & Media Engineering"
M1: Media production (GMP)
M2: Web technologies (WT)
M3: Content and knowledge management systems (CWM)
M4: Cross-media production (CP)
M5: Media economics and media management (MW)
M6: Project work (PA)
M7: E-business (EB)
http://www.leipzigschoolofmedia.de/