Created the file for the DatawareHouse db
Also drafted the first schema models in DBMerger
parent faaba17a98
commit 0d30e90ee0
0
Assets/Dataset/DatawareHouse/dataset.db
Normal file
28
Scripts/DataCleaning/DBMerger.py
Normal file
@@ -0,0 +1,28 @@
"""
What we have now:

Wikipedia-summary : PageId / abstract
Movies : Movie URI
Dataset : Movie URI / Relationship / Object [RDF]
Movies-PageId : Movie URI / PageId (wiki)
Reverse : Subject / Relationship / Movie URI

What we want:
(we will generate MovieID)
Movies : MovieID [PK] / Movie URI
WikiPageIDs : MovieID [PK, FK] / PageId [IDX] (wiki) (not important for now)
Abstracts : MovieID [PK, FK] / abstract
Subjects : SubjectID [PK] / RDF Subject (from either Dataset.csv or Reverse.csv) / OriginID [FK]
Relationships : RelationshipID [PK] / RDF Relationship (not the actual relationship but the value)
Objects : ObjectID [PK] / RDF Object / OriginID [FK]
Origins : OriginID [PK] / Origin Name
RDFs : RDF_ID [PK] / MovieID [FK] / SubjectID [FK] / RelationshipID [FK] / ObjectID [FK]

What we will build for the model:

We need the RDF list for each movie, together with its abstract

: MovieID / RDF_set / abstract

"""
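The "What we want" list in the docstring could be turned into SQLite tables roughly as follows. This is only a sketch, not the contents of DBMerger.py: the column names (MovieURI, PageID, RDFSubject, and so on), the column types, the UNIQUE constraints, and the helper name create_schema are assumptions made for illustration. Only the table names and the PK / FK / IDX markers come from the docstring.

```python
import sqlite3

# Hypothetical DDL mirroring the "What we want" list.
# Column names, types and UNIQUE constraints are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS Movies (
    MovieID  INTEGER PRIMARY KEY,
    MovieURI TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS WikiPageIDs (
    MovieID INTEGER PRIMARY KEY REFERENCES Movies(MovieID),
    PageID  INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_wikipageids_pageid ON WikiPageIDs(PageID);
CREATE TABLE IF NOT EXISTS Abstracts (
    MovieID  INTEGER PRIMARY KEY REFERENCES Movies(MovieID),
    Abstract TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS Origins (
    OriginID   INTEGER PRIMARY KEY,
    OriginName TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS Subjects (
    SubjectID  INTEGER PRIMARY KEY,
    RDFSubject TEXT NOT NULL,
    OriginID   INTEGER NOT NULL REFERENCES Origins(OriginID)
);
CREATE TABLE IF NOT EXISTS Relationships (
    RelationshipID  INTEGER PRIMARY KEY,
    RDFRelationship TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS Objects (
    ObjectID  INTEGER PRIMARY KEY,
    RDFObject TEXT NOT NULL,
    OriginID  INTEGER NOT NULL REFERENCES Origins(OriginID)
);
CREATE TABLE IF NOT EXISTS RDFs (
    RDF_ID         INTEGER PRIMARY KEY,
    MovieID        INTEGER NOT NULL REFERENCES Movies(MovieID),
    SubjectID      INTEGER NOT NULL REFERENCES Subjects(SubjectID),
    RelationshipID INTEGER NOT NULL REFERENCES Relationships(RelationshipID),
    ObjectID       INTEGER NOT NULL REFERENCES Objects(ObjectID)
);
"""


def create_schema(db_path=":memory:"):
    """Open (or create) the database and make sure all tables exist."""
    conn = sqlite3.connect(db_path)
    # SQLite only enforces foreign keys when this pragma is enabled per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.commit()
    return conn
```

Calling create_schema("Assets/Dataset/DatawareHouse/dataset.db") would presumably target the database file added in this commit; the default in-memory path is handy for trying the schema without touching that file.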
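The last part of the docstring asks for one record per movie in the shape MovieID / RDF_set / abstract. A minimal sketch of that grouping step, in plain Python, might look like the block below. The function name build_model_records, the tuple layout of the input rows, and the choice to skip movies that have no abstract are all assumptions, not something the commit specifies.

```python
from collections import defaultdict


def build_model_records(rdf_rows, abstracts):
    """Group RDF triples per movie and attach each movie's abstract.

    rdf_rows:  iterable of (MovieID, subject, relationship, object) tuples
    abstracts: dict mapping MovieID -> abstract text
    Returns a list of (MovieID, rdf_list, abstract) tuples sorted by MovieID.
    Movies without an abstract are skipped (an assumption of this sketch).
    """
    triples_by_movie = defaultdict(list)
    for movie_id, subject, relationship, obj in rdf_rows:
        triples_by_movie[movie_id].append((subject, relationship, obj))

    return [
        (movie_id, triples, abstracts[movie_id])
        for movie_id, triples in sorted(triples_by_movie.items())
        if movie_id in abstracts
    ]
```

In the normalized layout the input rows would come from joining RDFs with Subjects, Relationships and Objects, and the abstracts from the Abstracts table; doing the grouping in Python rather than SQL keeps the triples as structured tuples instead of one concatenated string.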