Different understanding about the original paper #4

Open
lanzhuzhu opened this issue Jun 12, 2018 · 3 comments
Comments

@lanzhuzhu

In the README, you say: “The model produces a test F1_score of 90.9 % with ~70 epochs. The results produced in the paper for the given architecture is 91.14 ”. In fact, the paper states that the 91.14 result was obtained under the condition "All other hyper-parameters and features remain the same as our best model in Table 5", that is, with the lex feature enabled. Since you do not use that feature, this architecture cannot be expected to reach 91.14.

@kamalkraj
Owner

kamalkraj commented Jun 12, 2018 via email

@davidsbatista

davidsbatista commented Nov 12, 2018

Did you use Viterbi decoding to find the best tag sequence? That might also explain the difference in results.

@shuiyueche

> Did you use Viterbi to do the decoding of the best sequence? That might also explain the different results

If I am not mistaken, the code here does not include a transition matrix over the tags, so there is no need to apply Viterbi here. That omission is itself a significant difference, though.
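To make the point about the transition matrix concrete, here is a minimal sketch of how the two decoding strategies can disagree. It is not the code from this repository: the function names `viterbi_decode` and `argmax_decode`, the three-tag set, and all scores are assumptions made up for illustration.

```python
def argmax_decode(emissions):
    """Pick the best tag at each position independently (no transition scores)."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in emissions]


def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag sequence.

    emissions[t][j]   : score of tag j at position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    n_tags = len(emissions[0])
    scores = list(emissions[0])   # best score of any path ending in each tag
    backpointers = []             # best previous tag for each (position, tag)
    for t in range(1, len(emissions)):
        new_scores, pointers = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags), key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        scores = new_scores
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag to recover the path.
    best = max(range(n_tags), key=lambda j: scores[j])
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    path.reverse()
    return path


# Hypothetical tag indices: 0 = O, 1 = B-PER, 2 = I-PER
emissions = [
    [2.0, 1.5, 0.0],   # position 0: O scores slightly higher than B-PER
    [0.5, 0.0, 2.0],   # position 1: I-PER scores highest
]
# Made-up transition scores: only O -> I-PER is heavily penalised.
transitions = [
    [0.0, 0.0, -10.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

print(argmax_decode(emissions))                 # [0, 2]: O followed by I-PER
print(viterbi_decode(emissions, transitions))   # [1, 2]: B-PER followed by I-PER
```

With all transition scores set to zero, both functions return the same sequence, which is why Viterbi adds nothing to a model that has no transition matrix.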
