Christian Risi
|
1797571bb2
|
Added test to see if illegal tokens were included in target
|
2025-10-06 16:17:12 +02:00 |
|
Christian Risi
|
e93710af08
|
Fixed illegal tokens being added in target output
|
2025-10-06 16:16:47 +02:00 |
|
Christian Risi
|
d3bba9b944
|
Added actual test
|
2025-10-06 16:06:17 +02:00 |
|
Christian Risi
|
b1e7af0607
|
Merge branch 'dev.embedder' of https://repositories.communitynotfound.work/PoliBa-DeepLearning/NanoSocrates into dev.embedder
|
2025-10-06 15:55:44 +02:00 |
|
Christian Risi
|
d3b1f7da91
|
Added testing for spanned masking
|
2025-10-06 15:55:40 +02:00 |
|
Christian Risi
|
c217f5dec9
|
Added 2 types of masking
|
2025-10-06 15:45:45 +02:00 |
|
Christian Risi
|
49f0beb6ea
|
Updated imports
|
2025-10-06 15:45:28 +02:00 |
|
GassiGiuseppe
|
05bb460999
|
file to test batch attention mask
|
2025-10-06 13:03:20 +02:00 |
|
GassiGiuseppe
|
948c3fd7ac
|
update to batch attention mask
|
2025-10-06 13:03:03 +02:00 |
|
GassiGiuseppe
|
87409fecd5
|
added method for batched attention_mask
|
2025-10-06 12:00:11 +02:00 |
|
GassiGiuseppe
|
7e40a36701
|
wip: NanoSocratesCore
|
2025-10-05 22:58:06 +02:00 |
|
GassiGiuseppe
|
d48815cca2
|
added task_type and updated init
|
2025-10-05 18:58:42 +02:00 |
|
GassiGiuseppe
|
0f243eaac2
|
added padding_mask entry to decoder and encoder
|
2025-10-05 18:46:06 +02:00 |
|
GassiGiuseppe
|
9c83d9fa71
|
Merge branch 'dev.embedder' of https://repositories.communitynotfound.work/PoliBa-DeepLearning/NanoSocrates into dev.embedder
|
2025-10-05 18:45:33 +02:00 |
|
Christian Risi
|
a693cbb77e
|
A set of utils for our pipeline
|
2025-10-05 18:37:43 +02:00 |
|
GassiGiuseppe
|
6f219f634f
|
Added attention_mask
|
2025-10-05 17:49:01 +02:00 |
|
GassiGiuseppe
|
b303affd18
|
updated uml of the model
|
2025-10-05 16:40:19 +02:00 |
|
Christian Risi
|
53c4decac7
|
Added playgrounds for the architecture
|
2025-10-05 16:30:23 +02:00 |
|
Christian Risi
|
c60da8ba82
|
Refactoring
|
2025-10-05 15:40:29 +02:00 |
|
Christian Risi
|
3b5e6c099c
|
Merge branch 'dev' into dev.embedder
|
2025-10-05 11:17:09 +02:00 |
|
Christian Risi
|
ba3a718480
|
Merge branch 'dev.etl' into dev
|
2025-10-05 11:16:54 +02:00 |
|
GassiGiuseppe
|
69fba7c3e9
|
new utility to generate a csv debug file of the output of the pipeline
|
2025-10-04 21:33:09 +02:00 |
|
GassiGiuseppe
|
76200d936d
|
added first classes (Encoder, Decoder, Attention) for the model
|
2025-10-04 21:07:58 +02:00 |
|
Christian Risi
|
9b656e7918
|
Added a playground to test the embedding phase
|
2025-10-04 19:43:42 +02:00 |
|
Christian Risi
|
9a797a0485
|
Added embedder code for "Attention is all you need"
|
2025-10-04 19:43:25 +02:00 |
|
Christian Risi
|
3b274ad807
|
Added a way to take the default special token list
|
2025-10-04 19:43:02 +02:00 |
|
Christian Risi
|
8f5e2f2f0d
|
Modifications
|
2025-10-04 19:42:45 +02:00 |
|
Christian Risi
|
da0bdf703b
|
Added a way to see vocabulary size
|
2025-10-04 19:42:29 +02:00 |
|
Christian Risi
|
03cdca1f00
|
Modified imports for BPE
|
2025-10-04 19:42:02 +02:00 |
|
Christian Risi
|
7188c8678a
|
Added imports for Embedder
|
2025-10-04 19:41:48 +02:00 |
|
Christian Risi
|
1eef25a697
|
Merge branch 'dev' into dev.embedder
|
2025-10-04 19:04:03 +02:00 |
|
Christian Risi
|
e9165fb146
|
Merge branch 'dev.bpe' into dev
|
2025-10-04 19:03:09 +02:00 |
|
GassiGiuseppe
|
bbadd4c521
|
update cleaning pipeline with a new method to also filter by number of films; also updated the signature of the pipeline
|
2025-10-04 19:00:05 +02:00 |
|
GassiGiuseppe
|
c2f9344c82
|
little test file
|
2025-10-04 18:58:20 +02:00 |
|
GassiGiuseppe
|
25f3a5d221
|
Logic to test BPE
|
2025-10-04 18:58:04 +02:00 |
|
Christian Risi
|
e8ff82c40a
|
Updated with tasks architectures
|
2025-10-04 10:57:12 +02:00 |
|
Christian Risi
|
23d1eaf99e
|
Fixed a rare bug over training multiple times
|
2025-10-04 10:47:39 +02:00 |
|
Christian Risi
|
25a6ad1254
|
Added model high level architecture
|
2025-10-03 23:37:16 +02:00 |
|
Christian Risi
|
460d4f5188
|
Renamed directory to Playgrounds
|
2025-10-03 22:59:43 +02:00 |
|
Christian Risi
|
c6ac6df2c2
|
Added stubs for other libraries
|
2025-10-03 20:28:23 +02:00 |
|
Christian Risi
|
15baba54ab
|
Sanity check to autodetect Device
|
2025-10-03 20:16:01 +02:00 |
|
Christian Risi
|
87f24878f4
|
Added shims for utils on using Pytorch
|
2025-10-03 20:11:14 +02:00 |
|
Christian Risi
|
999141f886
|
Merge branch 'dev' into dev.embedder
|
2025-10-03 18:08:34 +02:00 |
|
Christian Risi
|
8e095ebb7a
|
Added papers stub
|
2025-10-03 18:02:27 +02:00 |
|
Christian Risi
|
149deb407d
|
added cache directories
|
2025-10-03 18:01:05 +02:00 |
|
Christian Risi
|
8a21cb1b73
|
added python analysis
|
2025-10-03 18:00:52 +02:00 |
|
Christian Risi
|
d2a3dfe90f
|
Fixed bug
|
2025-10-03 17:59:46 +02:00 |
|
GassiGiuseppe
|
0f95aeb122
|
toy dictionary for bpe implemented
|
2025-10-03 16:26:01 +02:00 |
|
Christian Risi
|
0ee6e48004
|
Fixed the same bug as before, but this time it is correct
|
2025-10-03 16:09:53 +02:00 |
|
Christian Risi
|
55e0d2ac23
|
Fixed an encoding bug
|
2025-10-03 16:08:11 +02:00 |
|