Leveraging BERT for Extractive Text Summarization on Lectures for Japanese

Extract a feature vector for each sentence of the input text with Japanese BERT, cluster the feature vectors, and display the sentences at the cluster centers as the summary.
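The clustering step can be sketched roughly as follows. This is a minimal illustration, not the code in bert_sum_ext.py: it assumes k-means as the clustering method, uses toy 2-D vectors in place of real BERT features, and treats NUM_PREDICTS as the number of clusters. The function names (`kmeans`, `summarize`) and the value of NUM_PREDICTS are hypothetical.

```python
NUM_PREDICTS = 2  # hypothetical value; the real one is defined in bert_sum_ext.py


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means; the first k vectors are used as the initial centers."""
    centers = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: squared_distance(v, centers[c]))
            clusters[nearest].append(v)
        for c, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers


def summarize(sentences, vectors, k=NUM_PREDICTS):
    """Return, in document order, the sentence nearest to each cluster center."""
    centers = kmeans(vectors, k)
    chosen = set()
    for center in centers:
        candidates = [i for i in range(len(vectors)) if i not in chosen]
        best = min(candidates, key=lambda i: squared_distance(vectors[i], center))
        chosen.add(best)
    return [sentences[i] for i in sorted(chosen)]


# Toy stand-ins for sentence features (real BERT features have many more dimensions).
sentences = ["s0", "s1", "s2", "s3", "s4", "s5"]
vectors = [[0, 0], [2, 0], [1, 0], [10, 10], [12, 10], [11, 10]]
print(summarize(sentences, vectors))  # prints ['s2', 's5']
```

In this toy data the two groups of vectors are centered on `s2` and `s5`, so those two sentences are returned as the summary.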

Input

A Japanese text file.

sample.txt

Output

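基盤モデルの概要
基盤モデル（Foundation Model）とは、大量のデータから学習することで、高い汎化性能を獲得したAIのことです。
特に、基盤モデルはデータセットが巨大であるため、ConvolutionよりもVision Transformerを使用する方が性能が高くなっています。
当面、エッジでの計算リソースの関係で、基盤モデルの活用は限定的になる可能性もありますが、計算リソースはハードウェアの進化と共に、増加していくため、どこかのタイミングで基盤モデルが席巻するものと考えられます。

English translation of the output above:

Overview of foundation models
A foundation model is an AI that has acquired high generalization performance by learning from large amounts of data.
In particular, because the datasets for foundation models are huge, using a Vision Transformer gives higher performance than convolution.
For the time being, the use of foundation models may remain limited because of the computational resources available at the edge, but since computational resources grow as hardware evolves, foundation models are expected to become dominant at some point.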

The output consists of the top NUM_PREDICTS extracted summary sentences.
NUM_PREDICTS is defined in bert_sum_ext.py.

Usage

The ONNX and prototxt files are downloaded automatically on the first run, so an Internet connection is required at that time.

To run on the sample Japanese text file:

$ python3 bert_sum_ext.py

If you want to specify the input text file, put the text file path after the -f option.

$ python3 bert_sum_ext.py -f other.txt
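As a rough sketch of how a script could accept the -f option and split the input into sentences, consider the example below. It is not the argument handling used by bert_sum_ext.py: the helper names (`parse_args`, `split_sentences`), the default of sample.txt, and the naive split on the Japanese full stop "。" are all assumptions made for illustration.

```python
import argparse
import re


def parse_args(argv=None):
    """Parse the command line; -f selects the input text file (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Extractive summary of a Japanese text file (sketch)"
    )
    # Assumption: the sample file is used when -f is not given.
    parser.add_argument("-f", dest="file", default="sample.txt",
                        help="path to the input text file")
    return parser.parse_args(argv)


def split_sentences(text):
    """Naive split: a sentence ends at the full stop '。' or at a line break."""
    parts = re.findall(r"[^。\n]+。?", text)
    return [p.strip() for p in parts if p.strip()]


# Equivalent to running: python3 <script> -f other.txt
args = parse_args(["-f", "other.txt"])
print(args.file)

print(split_sentences("基盤モデルの概要\nこれは一文目です。これは二文目です。"))
```

A real implementation would then read the chosen file (for example with `open(args.file, encoding="utf-8")`) and pass each sentence to the BERT model.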

Reference

BERT Extractive Summarizer
日本語BERT (Japanese BERT)

Framework

PyTorch

Model Format

ONNX opset = 11

Netron

bert-base.onnx.prototxt
