Renamed folder DataCleaning to DatasetMerging since it doesn't clean anything
and instead builds the dataset
This commit is contained in:
parent
edd01a2c83
commit
ac1ed42c49
@ -1,28 +0,0 @@
"""
What we have now:

Wikipedia-summary : PageId / abstract
Movies            : Movie URI
Dataset           : Movie URI / Relationship / Object [RDF]
Movies-PageId     : Movie URI / PageId (wiki)
Reverse           : Subject / Relationship / Movie URI

What we want:
(we will generate MovieID)
Movies        : MovieID [PK] / Movie URI
WikiPageIDs   : MovieID [PK, FK] / PageId [IDX] (wiki) (not important for now)
Abstracts     : MovieID [PK, FK] / abstract
Subjects      : SubjectID [PK] / RDF Subject (from either Dataset.csv or Reverse.csv) / OriginID [FK]
Relationships : RelationshipID [PK] / RDF Relationship (not the actual relationship but the value)
Objects       : ObjectID [PK] / RDF Object / OriginID [FK]
Origins       : OriginID [PK] / Origin Name
RDFs          : RDF_ID [PK] / MovieID [FK] / SubjectID [FK] / RelationshipID [FK] / ObjectID [FK]

What we will build for the model:

We need the RDF list for each movie together with its abstract

: MovieID / RDF_set / abstract

"""

45
Scripts/DatasetMerging/DBMerger.py
Normal file
@ -0,0 +1,45 @@
"""
What we have now:                                             Saved as:

Wikipedia-summary : PageId / abstract                         subject,text
Movies            : Movie URI                                 "subject"
Dataset           : Movie URI / Relationship / Object [RDF]   subject,relationship,object
Movies-PageId     : Movie URI / PageId (wiki)                 "subject", "object"
Reverse           : Subject / Relationship / Movie URI        "subject","relationship","object"

What we want:
(we will generate MovieID)
Movies        : MovieID [PK] / Movie URI
WikiPageIDs   : MovieID [PK, FK] / PageId [IDX] (wiki) (not important for now)
Abstracts     : MovieID [PK, FK] / abstract
Subjects      : SubjectID [PK] / RDF Subject (from either Dataset.csv or Reverse.csv) / OriginID [FK]
Relationships : RelationshipID [PK] / RDF Relationship (not the actual relationship but the value)
Objects       : ObjectID [PK] / RDF Object / OriginID [FK]
Origins       : OriginID [PK] / Origin Name
RDFs          : RDF_ID [PK] / MovieID [FK] / SubjectID [FK] / RelationshipID [FK] / ObjectID [FK]

What we will build for the model:

We need the RDF list for each movie together with its abstract

: MovieID / RDF_set / abstract

"""

import sqlite3
from contextlib import closing

# Create a SQL connection to our SQLite database;
# closing() makes sure the connection is closed even if a query fails
with closing(sqlite3.connect("data/portal_mammals.sqlite")) as con:
    cur = con.cursor()

    # Return all results of query (value passed as a bound parameter)
    cur.execute("SELECT plot_id FROM plots WHERE plot_type = ?", ("Control",))
    control_plots = cur.fetchall()

    # Return first result of query
    cur.execute("SELECT species FROM species WHERE taxa = ?", ("Bird",))
    first_bird = cur.fetchone()
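
The "What we want" list in the docstring of DBMerger.py reads like a relational schema, so it could be created in SQLite roughly as follows. This is a minimal sketch, not the author's implementation: the table names come from the docstring, while the column names (`MovieURI`, `SubjectURI`, `OriginName`, and so on), the column types, the index name, and the `create_schema` helper are assumptions made for illustration. It uses an in-memory database so that it runs without any of the project's files.

```python
import sqlite3

# Table names follow the docstring; column names and types are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS Movies (
    MovieID  INTEGER PRIMARY KEY,   -- generated by SQLite on insert
    MovieURI TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS WikiPageIDs (
    MovieID INTEGER PRIMARY KEY REFERENCES Movies(MovieID),
    PageID  INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_wikipageids_pageid ON WikiPageIDs(PageID);
CREATE TABLE IF NOT EXISTS Abstracts (
    MovieID  INTEGER PRIMARY KEY REFERENCES Movies(MovieID),
    Abstract TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS Origins (
    OriginID   INTEGER PRIMARY KEY,
    OriginName TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS Subjects (
    SubjectID  INTEGER PRIMARY KEY,
    SubjectURI TEXT NOT NULL,
    OriginID   INTEGER NOT NULL REFERENCES Origins(OriginID)
);
CREATE TABLE IF NOT EXISTS Relationships (
    RelationshipID  INTEGER PRIMARY KEY,
    RelationshipURI TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS Objects (
    ObjectID  INTEGER PRIMARY KEY,
    ObjectURI TEXT NOT NULL,
    OriginID  INTEGER NOT NULL REFERENCES Origins(OriginID)
);
CREATE TABLE IF NOT EXISTS RDFs (
    RDF_ID         INTEGER PRIMARY KEY,
    MovieID        INTEGER NOT NULL REFERENCES Movies(MovieID),
    SubjectID      INTEGER NOT NULL REFERENCES Subjects(SubjectID),
    RelationshipID INTEGER NOT NULL REFERENCES Relationships(RelationshipID),
    ObjectID       INTEGER NOT NULL REFERENCES Objects(ObjectID)
);
"""


def create_schema(con: sqlite3.Connection) -> None:
    """Create the target tables (hypothetical helper, not part of the commit)."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    con.execute("PRAGMA foreign_keys = ON")
    con.executescript(SCHEMA)


con = sqlite3.connect(":memory:")
create_schema(con)

# List the tables that now exist in the database.
tables = [row[0] for row in con.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Declaring `MovieID` as `INTEGER PRIMARY KEY` lets SQLite assign the ID on insert, which is one way to cover the "we will generate MovieID" note.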
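
The last part of the docstring asks for one row per movie in the form MovieID / RDF_set / abstract. One possible way to assemble that from the normalized tables is to join the ID tables back to their values and group the triples by movie in Python. The sketch below is an assumption-laden illustration: the tables are simplified (no `Origins`, no foreign keys), the column names and the `build_model_rows` helper are hypothetical, and the inserted rows are made-up sample data, not taken from the real dataset.

```python
import sqlite3
from collections import defaultdict

con = sqlite3.connect(":memory:")
con.executescript("""
-- Simplified versions of the docstring's tables; column names are assumptions.
CREATE TABLE Abstracts (MovieID INTEGER PRIMARY KEY, Abstract TEXT);
CREATE TABLE Subjects (SubjectID INTEGER PRIMARY KEY, SubjectURI TEXT);
CREATE TABLE Relationships (RelationshipID INTEGER PRIMARY KEY, RelationshipURI TEXT);
CREATE TABLE Objects (ObjectID INTEGER PRIMARY KEY, ObjectURI TEXT);
CREATE TABLE RDFs (RDF_ID INTEGER PRIMARY KEY, MovieID INTEGER, SubjectID INTEGER,
                   RelationshipID INTEGER, ObjectID INTEGER);

-- Made-up sample rows, for illustration only.
INSERT INTO Abstracts VALUES (1, 'An example abstract.');
INSERT INTO Subjects VALUES (1, 'ex:MovieA'), (2, 'ex:PersonB');
INSERT INTO Relationships VALUES (1, 'ex:director'), (2, 'ex:knownFor');
INSERT INTO Objects VALUES (1, 'ex:PersonB'), (2, 'ex:MovieA');
INSERT INTO RDFs VALUES (1, 1, 1, 1, 1), (2, 1, 2, 2, 2);
""")


def build_model_rows(con):
    """Return a list of (MovieID, [(subject, relationship, object), ...], abstract)."""
    query = """
        SELECT r.MovieID, s.SubjectURI, rel.RelationshipURI, o.ObjectURI, a.Abstract
        FROM RDFs AS r
        JOIN Subjects AS s ON s.SubjectID = r.SubjectID
        JOIN Relationships AS rel ON rel.RelationshipID = r.RelationshipID
        JOIN Objects AS o ON o.ObjectID = r.ObjectID
        JOIN Abstracts AS a ON a.MovieID = r.MovieID
        ORDER BY r.MovieID, r.RDF_ID
    """
    triples = defaultdict(list)
    abstracts = {}
    for movie_id, subj, rel, obj, abstract in con.execute(query):
        triples[movie_id].append((subj, rel, obj))
        abstracts[movie_id] = abstract
    return [(mid, triples[mid], abstracts[mid]) for mid in sorted(triples)]


rows = build_model_rows(con)
for movie_id, rdf_set, abstract in rows:
    print(movie_id, rdf_set, abstract)
```

Because the query uses inner joins, a movie without an abstract is dropped from the result; a `LEFT JOIN` on `Abstracts` would keep it with a `None` abstract if that is preferred.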