duplicated block id: 1
size: 9 cleaned lines of code
in 2 files:
- collection/download_commoncrawl_passages.py (241:250)
- collection/paragraph_chunker.py (91:100)