N_17. Loading into a Graph / SPARQL

joyHong 2021. 9. 9. 01:41

02.QueryToGraph_SPARQL
Exploring a Graph and Using SPARQL

We load RDF data stored in a file into a graph and then try several ways of exploring the graph.

We then query the data held in the graph with SPARQL and look at the results.

Author : 허홍수
e-mail : su4620@gmail.com
blog : http://joyhong.tistory.com

Exploration

Loading the file

In [1]:
from rdflib import Graph, RDF, URIRef

g = Graph()
g.parse("./sample_result.ttl", format='turtle')
Out[1]:
<Graph identifier=N28525a3444ce4433bbb77add0b79773d (<class 'rdflib.graph.Graph'>)>

Number of triples in the graph

In [2]:
len(g)
Out[2]:
5678

Exploring the data in the graph

In [3]:
# for s, p, o in g:   # all triples
for s, p, o in list(g)[:10]: # only 10 of them (a graph is unordered, so which 10 is arbitrary)
    print(f"{s} \t{p} \t{o}")
http://joyhong.tistory.com/resource/h_146 	http://xmlns.com/foaf/0.1/homepage 	http://www.hanseohospital.or.kr
http://joyhong.tistory.com/resource/geo_h128 	http://schema.org/longitude 	126.899838
http://joyhong.tistory.com/resource/h_73 	http://joyhong.tistory.com/ontology/totalNumberOfDoctor 	23
http://joyhong.tistory.com/resource/rg_360500 	http://www.w3.org/2000/01/rdf-schema#label 	여수시
http://joyhong.tistory.com/resource/rg_340012 	http://www.w3.org/2004/02/skos/core#prefLabel 	예산군
http://joyhong.tistory.com/resource/geo_h13 	http://schema.org/latitude 	35.12019
http://joyhong.tistory.com/resource/h_241 	http://xmlns.com/foaf/0.1/page 	http://www.hira.or.kr/re/diag/getDiagAmtInfo.do?ykiho=JDQ4MTYyMiM4MSMkMSMkOCMkODkkMzgxMzUxIzExIyQxIyQzIyQ4OSQzNjE4MzIjNDEjJDEjJDgjJDgz
http://joyhong.tistory.com/resource/h_271 	http://schema.org/geo 	http://joyhong.tistory.com/resource/geo_h271
http://joyhong.tistory.com/resource/geo_h16 	http://schema.org/postalCode 	6351
http://joyhong.tistory.com/resource/geo_h79 	http://schema.org/postalCode 	49267

Exploring only the triples whose predicate is rdf:type

In [4]:
for s, p, o in list(g.triples((None, RDF.type, None)))[:10]:
    print(f"{s} \t{p} \t{o}")
http://joyhong.tistory.com/resource/h_109 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://schema.org/Hospital
http://joyhong.tistory.com/resource/h_104 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://schema.org/Hospital
http://joyhong.tistory.com/resource/h_249 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://schema.org/Hospital
http://joyhong.tistory.com/resource/h_157 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://schema.org/Hospital
http://joyhong.tistory.com/resource/h_94 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://schema.org/Hospital
http://joyhong.tistory.com/resource/h_190 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://schema.org/Hospital
http://joyhong.tistory.com/resource/rg_110024 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://www.w3.org/2004/02/skos/core#Concept
http://joyhong.tistory.com/resource/rg_311800 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://www.w3.org/2004/02/skos/core#Concept
http://joyhong.tistory.com/resource/h_108 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://schema.org/Hospital
http://joyhong.tistory.com/resource/h_172 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://schema.org/Hospital

Various ways of exploring

Subject lookup - the predicate is rdf:type and the object is anything (None)

In [5]:
for _subj in list(g.subjects(RDF.type, None))[:10]:
    print(_subj)
http://joyhong.tistory.com/resource/h_109
http://joyhong.tistory.com/resource/h_104
http://joyhong.tistory.com/resource/h_249
http://joyhong.tistory.com/resource/h_157
http://joyhong.tistory.com/resource/h_94
http://joyhong.tistory.com/resource/h_190
http://joyhong.tistory.com/resource/rg_110024
http://joyhong.tistory.com/resource/rg_311800
http://joyhong.tistory.com/resource/h_108
http://joyhong.tistory.com/resource/h_172

Predicate lookup - the subject is http://joyhong.tistory.com/resource/h_272 and the object is anything

In [6]:
subject = URIRef('http://joyhong.tistory.com/resource/h_272')
In [7]:
for _pred in g.predicates(subject, None):
    print(_pred)
http://joyhong.tistory.com/ontology/openedDate
http://schema.org/telephone
http://purl.org/dc/terms/subject
http://purl.org/dc/terms/subject
http://purl.org/dc/terms/identifier
http://joyhong.tistory.com/ontology/totalNumberOfDoctor
http://schema.org/geo
http://www.w3.org/1999/02/22-rdf-syntax-ns#type
http://xmlns.com/foaf/0.1/homepage
http://purl.org/dc/terms/subject
http://www.w3.org/2000/01/rdf-schema#label
http://xmlns.com/foaf/0.1/page

Object lookup - same subject, any predicate

In [8]:
for _obj in g.objects(subject, None):
    print(_obj)
2002-05-13
031-999-1000
http://joyhong.tistory.com/resource/rg_310000
http://joyhong.tistory.com/resource/cat_11
JDQ4MTYyMiM1MSMkMSMkMCMkODkkMzgxMzUxIzExIyQxIyQ3IyQ3OSQ0NjE0ODEjODEjJDEjJDYjJDgz
83
http://joyhong.tistory.com/resource/geo_h272
http://schema.org/Hospital
http://www.gwhospital.co.kr/
http://joyhong.tistory.com/resource/rg_312300
의료법인우리의료재단김포우리병원
http://www.hira.or.kr/re/diag/getDiagAmtInfo.do?ykiho=JDQ4MTYyMiM1MSMkMSMkMCMkODkkMzgxMzUxIzExIyQxIyQ3IyQ3OSQ0NjE0ODEjODEjJDEjJDYjJDgz

When an object is a URIRef resource, also explore every triple that has that object as its subject

In [9]:
for s, p, o in g.triples((URIRef('http://joyhong.tistory.com/resource/h_78'), None, None)):
    print(f"{s} \t{p} \t{o}")
    # If the object is itself a resource, print the triples in which it is the subject
    if isinstance(o, URIRef):
        for ss, pp, oo in g.triples((o, None, None)):
            print(f"{ss} \t{pp} \t{oo}")
http://joyhong.tistory.com/resource/h_78 	http://joyhong.tistory.com/ontology/openedDate 	2019-03-29
http://joyhong.tistory.com/resource/h_78 	http://www.w3.org/2000/01/rdf-schema#label 	계명대학교대구동산병원
http://joyhong.tistory.com/resource/h_78 	http://joyhong.tistory.com/ontology/totalNumberOfDoctor 	35
http://joyhong.tistory.com/resource/h_78 	http://xmlns.com/foaf/0.1/page 	http://www.hira.or.kr/re/diag/getDiagAmtInfo.do?ykiho=JDQ4MTYyMiM4MSMkMSMkMCMkODkkMzgxMzUxIzExIyQxIyQzIyQ3MiQzNjEwMDIjNjEjJDEjJDgjJDgz
http://joyhong.tistory.com/resource/h_78 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://schema.org/Hospital
http://joyhong.tistory.com/resource/h_78 	http://purl.org/dc/terms/subject 	http://joyhong.tistory.com/resource/rg_230006
http://joyhong.tistory.com/resource/rg_230006 	http://www.w3.org/2000/01/rdf-schema#label 	대구중구
http://joyhong.tistory.com/resource/rg_230006 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://www.w3.org/2004/02/skos/core#Concept
http://joyhong.tistory.com/resource/rg_230006 	http://www.w3.org/2004/02/skos/core#broader 	http://joyhong.tistory.com/resource/rg_230000
http://joyhong.tistory.com/resource/rg_230006 	http://www.w3.org/2004/02/skos/core#prefLabel 	대구중구
http://joyhong.tistory.com/resource/rg_230006 	http://purl.org/dc/terms/identifier 	230006
http://joyhong.tistory.com/resource/h_78 	http://purl.org/dc/terms/subject 	http://joyhong.tistory.com/resource/rg_230000
http://joyhong.tistory.com/resource/rg_230000 	http://www.w3.org/2004/02/skos/core#narrower 	http://joyhong.tistory.com/resource/rg_230004
http://joyhong.tistory.com/resource/rg_230000 	http://www.w3.org/2000/01/rdf-schema#label 	대구
http://joyhong.tistory.com/resource/rg_230000 	http://www.w3.org/2004/02/skos/core#narrower 	http://joyhong.tistory.com/resource/rg_230002
http://joyhong.tistory.com/resource/rg_230000 	http://www.w3.org/2004/02/skos/core#narrower 	http://joyhong.tistory.com/resource/rg_230006
http://joyhong.tistory.com/resource/rg_230000 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://www.w3.org/2004/02/skos/core#Concept
http://joyhong.tistory.com/resource/rg_230000 	http://www.w3.org/2004/02/skos/core#narrower 	http://joyhong.tistory.com/resource/rg_230001
http://joyhong.tistory.com/resource/rg_230000 	http://www.w3.org/2004/02/skos/core#narrower 	http://joyhong.tistory.com/resource/rg_230005
http://joyhong.tistory.com/resource/rg_230000 	http://purl.org/dc/terms/identifier 	230000
http://joyhong.tistory.com/resource/rg_230000 	http://www.w3.org/2004/02/skos/core#narrower 	http://joyhong.tistory.com/resource/rg_230007
http://joyhong.tistory.com/resource/rg_230000 	http://www.w3.org/2004/02/skos/core#prefLabel 	대구
http://joyhong.tistory.com/resource/rg_230000 	http://www.w3.org/2004/02/skos/core#narrower 	http://joyhong.tistory.com/resource/rg_230003
http://joyhong.tistory.com/resource/h_78 	http://schema.org/geo 	http://joyhong.tistory.com/resource/geo_h78
http://joyhong.tistory.com/resource/geo_h78 	http://schema.org/longitude 	128.5831323
http://joyhong.tistory.com/resource/geo_h78 	http://schema.org/postalCode 	41931
http://joyhong.tistory.com/resource/geo_h78 	http://schema.org/latitude 	35.8694728
http://joyhong.tistory.com/resource/geo_h78 	http://schema.org/address 	대구광역시 중구 달성로 56 계명대학교대구동산병원 (동산동)
http://joyhong.tistory.com/resource/h_78 	http://schema.org/telephone 	053-250-8013
http://joyhong.tistory.com/resource/h_78 	http://purl.org/dc/terms/subject 	http://joyhong.tistory.com/resource/cat_11
http://joyhong.tistory.com/resource/cat_11 	http://www.w3.org/2004/02/skos/core#prefLabel 	종합병원
http://joyhong.tistory.com/resource/cat_11 	http://www.w3.org/1999/02/22-rdf-syntax-ns#type 	http://www.w3.org/2004/02/skos/core#Concept
http://joyhong.tistory.com/resource/cat_11 	http://www.w3.org/2000/01/rdf-schema#label 	종합병원
http://joyhong.tistory.com/resource/cat_11 	http://purl.org/dc/terms/identifier 	11
http://joyhong.tistory.com/resource/h_78 	http://purl.org/dc/terms/identifier 	JDQ4MTYyMiM4MSMkMSMkMCMkODkkMzgxMzUxIzExIyQxIyQzIyQ3MiQzNjEwMDIjNjEjJDEjJDgjJDgz

SPARQL

SELECT query

In [10]:
qres = g.query("""
    SELECT ?s ?p ?o 
    WHERE { 
        ?s ?p ?o .
    } LIMIT 10
""")
for row in qres:
    print(f"{row[0]}\t{row[1]}\t{row[2]}")
http://joyhong.tistory.com/resource/h_146	http://xmlns.com/foaf/0.1/homepage	http://www.hanseohospital.or.kr
http://joyhong.tistory.com/resource/geo_h128	http://schema.org/longitude	126.899838
http://joyhong.tistory.com/resource/h_73	http://joyhong.tistory.com/ontology/totalNumberOfDoctor	23
http://joyhong.tistory.com/resource/rg_360500	http://www.w3.org/2000/01/rdf-schema#label	여수시
http://joyhong.tistory.com/resource/rg_340012	http://www.w3.org/2004/02/skos/core#prefLabel	예산군
http://joyhong.tistory.com/resource/geo_h13	http://schema.org/latitude	35.12019
http://joyhong.tistory.com/resource/h_241	http://xmlns.com/foaf/0.1/page	http://www.hira.or.kr/re/diag/getDiagAmtInfo.do?ykiho=JDQ4MTYyMiM4MSMkMSMkOCMkODkkMzgxMzUxIzExIyQxIyQzIyQ4OSQzNjE4MzIjNDEjJDEjJDgjJDgz
http://joyhong.tistory.com/resource/h_271	http://schema.org/geo	http://joyhong.tistory.com/resource/geo_h271
http://joyhong.tistory.com/resource/geo_h16	http://schema.org/postalCode	6351
http://joyhong.tistory.com/resource/geo_h79	http://schema.org/postalCode	49267
In [11]:
qres = g.query("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT *
    WHERE {
      ?subject rdfs:label ?label.
      ?subject foaf:homepage ?homepage.
      filter(regex(?label, "서울", "i" )) 
    }
""")
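for row in qres:
    # With SELECT * the column order is up to the engine, so read the columns by variable name
    print(f"{row.homepage}\t{row.subject}\t{row.label}")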
http://eunyang.imc-boram.co.kr/	http://joyhong.tistory.com/resource/h_158	서울산보람병원
http://www.brmh.org	http://joyhong.tistory.com/resource/h_162	서울특별시보라매병원
http://www.seoulmc.or.kr	http://joyhong.tistory.com/resource/h_164	서울특별시서울의료원
http://www.snuh.org	http://joyhong.tistory.com/resource/h_17	서울대학교병원
http://www.symcs.co.kr	http://joyhong.tistory.com/resource/h_150	삼육서울병원
http://www.schmc.ac.kr/seoul/kor/index.do	http://joyhong.tistory.com/resource/h_177	순천향대학교 부속 서울병원
http://www.amc.seoul.kr	http://joyhong.tistory.com/resource/h_30	재단법인아산사회복지재단 서울아산병원
http://www.paik.ac.kr/seoul/	http://joyhong.tistory.com/resource/h_295	인제대학교 서울백병원
http://www.srch.or.kr	http://joyhong.tistory.com/resource/h_160	서울적십자병원
http://www.snubh.org/index.do	http://joyhong.tistory.com/resource/h_15	분당서울대학교병원
http://www.seoulhp.co.kr/	http://joyhong.tistory.com/resource/h_238	의료법인 자산의료재단 제천서울병원
http://www.sshosp.co.kr	http://joyhong.tistory.com/resource/h_159	서울성심병원
http://www.samsunghospital.com	http://joyhong.tistory.com/resource/h_16	삼성서울병원
http://www.dbhosp.go.kr	http://joyhong.tistory.com/resource/h_161	서울특별시 동부병원
http://www.cmcseoul.or.kr/	http://joyhong.tistory.com/resource/h_40	학교법인가톨릭학원가톨릭대학교서울성모병원

ASK query

In [12]:
qres = g.query("""
    ASK {<http://joyhong.tistory.com/resource/h_273> ?p ?po}
""")
# An ASK result has no rows; converting it to bool gives the answer
print(bool(qres))
True

Finding a specific hospital with SPARQL and opening its page

In [13]:
from IPython.display import IFrame

def window_open(url):
    # Return the IFrame so that the notebook can render it as the cell output
    return IFrame(src=url, width='100%', height='500px')
In [14]:
qres = g.query("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?page
    WHERE {
      ?subject rdfs:label '삼성서울병원'.
      ?subject foaf:page ?page.
    }
""")

for row in qres:
    url = str(row[0])  # plain string form of the result term; the last match wins

IFrame(src=url, width='100%', height='500px')
Out[14]:
(embedded IFrame showing the page at the retrieved URL)