THE INTEGRATION OF BANKING AND TELECOMMUNICATIONS: THE NEED FOR REGULATORY REFORM

436 JOURNAL OF LAW AND POLICY

The models based on word features achieve their best
performance at about 400 features, far fewer than the character
n-gram representation needs. However, the performance of models
with more features does not drop dramatically, as it does in the
cross-topic experiments. Again, this confirms the above
conclusion about the usefulness of thematic-related information
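
To make the comparison of feature-set sizes concrete, the following is a minimal sketch of how a word or character 3-gram representation of a given size might be built. It assumes that features are ranked by raw frequency in the training texts and that words are obtained by lowercased whitespace splitting; neither choice is stated in this section, and the function names (char_ngrams, words, top_features, vectorize) are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return all overlapping character n-grams of a text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def words(text):
    """Lowercased whitespace tokenization (a simplifying assumption)."""
    return text.lower().split()


def top_features(texts, extract, k):
    """Select the k most frequent items over all training texts.

    Ranking by raw frequency is an assumption made for illustration.
    """
    counts = Counter()
    for t in texts:
        counts.update(extract(t))
    return [item for item, _ in counts.most_common(k)]


def vectorize(text, extract, features):
    """Relative frequency of each selected feature in a single text."""
    counts = Counter(extract(text))
    total = sum(counts.values()) or 1  # avoid division by zero on empty text
    return [counts[f] / total for f in features]
```

Varying k in top_features (for instance, about 400 for words versus several thousand for character 3-grams) corresponds to moving along the Features axis of the plots. The resulting vectors could then be passed to any classifier.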

Figure 5: Performance of the cross-topic attribution models (training on Politics, test on UK).

Figure 6: Performance of the cross-genre attribution models (training on Politics, test on Book reviews).

[Plot: y-axis Accuracy (%), 30 to 100; x-axis Features, 0 to 15000; series: Words, Char 3-grams.]