pandas
pyyaml
numpy
matplotlib
networkx
rapidfuzz
pyarrow
pdfminer.six
spacy
rank-bm25
scikit-learn
sentence-transformers
faiss-cpu