Discord Chat Analyzer

GitLab Repository

The Discord Chat Analyzer (DCA) operates on static chat log files that were previously exported with DiscordChatExporter.

Architecture and Output

All .json logs are first parsed into Java entities, yielding one Channel object per log file. Each channel is then processed by Analyzer.java, which reads every message and tracks a range of statistics for each author (chatter). The statistics are kept in TreeMaps, so they are sorted as they are inserted.

Depending on the RankingType, the analyzer writes its results to a separate file.

RankingType.java
package analyzer.models.ranking;

public enum RankingType {
    MOST_MESSAGES,
    MOST_EMBEDS,
    MOST_ATTACHMENTS,
    TIMES_MENTIONED,
    ACCOUNT_AGE,
    MOST_COMMON_REACTION,
    AVG_WORD_COUNT
}

The data is aggregated and analyzed by an implementation of Ranking.java.

Ranking.java
import analyzer.stats.AuthorData;
import lombok.Getter;
import lombok.Setter;

import java.util.List;

public abstract class Ranking {

    @Getter
    @Setter
    private transient List<AuthorData> authorDataList;

    public Ranking(List<AuthorData> authorDataList) {
        this.authorDataList = authorDataList;
    }

    public String getOutputFilePath() {
        return "logs/not-implemented.json";
    }
}
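
A concrete ranking would presumably override getOutputFilePath() so that each RankingType ends up in its own file. The following is a minimal sketch of that idea without Lombok. The nested AuthorData record, the subclass body and the path logs/times-mentioned.json are assumptions for illustration and are not taken from the repository.

```java
import java.util.List;

class RankingSubclassSketch {

    // Simplified stand-in for analyzer.stats.AuthorData (assumption).
    record AuthorData(String authorId, long timesMentioned) {}

    // Mirrors the abstract Ranking class, with a hand-written getter instead of Lombok.
    abstract static class Ranking {
        private final List<AuthorData> authorDataList;

        Ranking(List<AuthorData> authorDataList) {
            this.authorDataList = authorDataList;
        }

        List<AuthorData> getAuthorDataList() {
            return authorDataList;
        }

        // Default path, used when a subclass does not override it.
        String getOutputFilePath() {
            return "logs/not-implemented.json";
        }
    }

    // Hypothetical subclass; the file name is illustrative only.
    static class TimesMentionedRanking extends Ranking {
        TimesMentionedRanking(List<AuthorData> authorDataList) {
            super(authorDataList);
        }

        @Override
        String getOutputFilePath() {
            return "logs/times-mentioned.json";
        }
    }

    public static void main(String[] args) {
        Ranking ranking = new TimesMentionedRanking(List.of(new AuthorData("1", 3)));
        System.out.println(ranking.getOutputFilePath()); // prints logs/times-mentioned.json
    }
}
```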

Ranking implementations

Program Sequence

Entity Models

The entity classes were derived by manually inspecting the fields available in the JSON logs; each JSON object has its own POJO. Lombok keeps the POJOs fairly small. They could be reduced even further by implementing each entity as a Java record (a standard language feature since Java 16, so available in Java 17).
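
As a rough illustration of the record idea, ChannelInfo could look like the sketch below. The name ChannelInfoRecord and the sample values are hypothetical. Whether the JSON mapping keeps working depends on the Gson version in use (record support appears to have been added in Gson 2.10), so this is an assumption to verify rather than a drop-in replacement.

```java
class ChannelInfoRecordSketch {

    // Hypothetical record variant of ChannelInfo; requires Java 16 or newer.
    // The compiler generates the constructor, accessors, equals, hashCode and toString.
    record ChannelInfoRecord(
            String id,
            String type,
            String categoryId,
            String category,
            String name,
            String topic) {
    }

    public static void main(String[] args) {
        // Sample values are made up for the demo.
        ChannelInfoRecord info = new ChannelInfoRecord("42", "text", "7", "General", "chat", null);
        // Accessors are named after the components, without a "get" prefix.
        System.out.println(info.name()); // prints chat
    }
}
```

Records are immutable, so they fit entities that are only read after parsing. Classes that are updated during the analysis, such as AuthorData, would still need regular fields and setters.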

Each server (Guild) has multiple channels. Each Channel holds one ChannelInfo and any number of messages.

Channel.java
import analyzer.models.DateRange;
import analyzer.models.Guild;
import analyzer.models.message.Message;
import lombok.Getter;
import lombok.Setter;

@Getter
@Setter
public class Channel {

    private Guild guild;
    private ChannelInfo channel;
    private DateRange dateRange;
    private Message[] messages;
}

ChannelInfo.java
import lombok.Getter;
import lombok.Setter;

@Getter
@Setter
public class ChannelInfo {
    private String id;
    private String type;
    private String categoryId;
    private String category;
    private String name;
    private String topic;
}

Message.java
import analyzer.models.Author;
import analyzer.models.message.embed.Embed;
import analyzer.models.message.reaction.Reaction;
import lombok.Getter;
import lombok.Setter;

@Getter
@Setter
public class Message {
    private String id;
    private String type;
    private String timestamp;
    private String timestampEdited;
    private String callEndedTimestamp;
    private boolean isPinned;
    private String content;
    private Author author;
    private Attachment[] attachments;
    private Embed[] embeds;
    private Reaction[] reactions;
    private Mention[] mentions;
    private Reference reference;
}

Load JSON

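We instantiate a Gson object and read all JSON log files in parallel. Because several threads add their results at the same time, the target list has to be thread-safe; a plain ArrayList can lose entries or throw under concurrent add() calls.

Main.java
// additionally requires: import java.util.Collections;
private static List<Channel> parseJsonToChannels() {
    final Gson gson = new GsonBuilder().setDateFormat(DateFormat.FULL, DateFormat.FULL).create();

    // parallelStream() adds from several threads, so wrap the list in a synchronized view
    final List<Channel> channels = Collections.synchronizedList(new ArrayList<>());
    final List<String> logs = readLogs();

    logs.parallelStream().forEach(logFilePath -> {
        try (Reader reader = Files.newBufferedReader(Paths.get(logFilePath))) {
            final Channel channel = gson.fromJson(reader, Channel.class);
            channels.add(channel);
        } catch (Exception ex) {
            // a broken log file is reported and skipped, the remaining files are still parsed
            ex.printStackTrace();
        }
    });

    return channels;
}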
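
Adding to a shared list from inside a parallel stream is only safe if that list is thread-safe. An alternative that avoids shared mutable state altogether is to let the stream collect the results. The sketch below shows this pattern under simplifying assumptions: the class and method names are hypothetical, and the parser is passed in as a function so the example runs without Gson (in the analyzer it would wrap gson.fromJson).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

class ParallelLoadSketch {

    // Reads every file in parallel and collects the parsed results.
    // collect() merges the per-thread results, so no shared list is mutated.
    static <T> List<T> loadAll(List<Path> files, Function<String, T> parser) {
        return files.parallelStream()
                .map(file -> readAndParse(file, parser))
                .flatMap(Optional::stream) // skip files that could not be read
                .collect(Collectors.toList());
    }

    private static <T> Optional<T> readAndParse(Path file, Function<String, T> parser) {
        try {
            return Optional.ofNullable(parser.apply(Files.readString(file)));
        } catch (IOException ex) {
            System.err.println("Could not read " + file + ": " + ex.getMessage());
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws IOException {
        Path a = Files.createTempFile("log-a", ".json");
        Path b = Files.createTempFile("log-b", ".json");
        Files.writeString(a, "{\"messages\":[]}");
        Files.writeString(b, "{}");
        // String::length stands in for real JSON parsing here.
        System.out.println(loadAll(List.of(a, b), String::length));
    }
}
```

For a list source, the collected result keeps the order of the input files, which a concurrently filled list does not guarantee.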

Data Aggregation

The loaded log data is passed through several aggregation and preparation methods, once for each author (chatter) found.

populateAuthorDataMap()
private void populateAuthorDataMap(AuthorData authorData, Message message) {
    analyzeMessage(authorData, message);
    authorData.setAuthorId(message.getAuthor().getId());
    authorData.setAuthor(message.getAuthor());
    authorData.setEarliestLocalDate(message.getTimestamp());
    authorDataMap.put(message.getAuthor(), authorData);
}

analyzeMessage()
private void analyzeMessage(AuthorData authorData, Message message) {
    authorData.incrementMessages();
    analyzeContent(authorData, message);
    analyzeEmbeds(authorData, message);
    analyzeAttachments(authorData, message);
    analyzeMentions(authorData, message);
    analyzeReactions(authorData, message.getReactions());
}

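The analyzeMentions method, for example, maintains the timesMentioned counter (mentions look like @UserXYZ). Note that, as shown here, it increments the counter of the message's author once for every message that contains at least one mention; it does not look up the users who are mentioned.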

analyzeMentions()
private void analyzeMentions(AuthorData authorData, Message message) {
    final Mention[] mentions = message.getMentions();
    if (mentions != null && mentions.length > 0) {
        authorData.incrementTimesMentioned();
    }
}
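
If the intent is to credit the users who are actually mentioned, the counter would need to be looked up by the mentioned user's id rather than by the message's author. A minimal sketch of that variant follows. The Mention and Message records and the assumption that a mention carries the user id are simplifications for this example, not the project's entity classes.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MentionCountSketch {

    // Simplified stand-ins for the entity classes (assumption: a mention carries the user id).
    record Mention(String id) {}
    record Message(String authorId, List<Mention> mentions) {}

    // Returns how often each user id was mentioned across all messages.
    static Map<String, Long> countTimesMentioned(List<Message> messages) {
        Map<String, Long> timesMentioned = new HashMap<>();
        for (Message message : messages) {
            if (message.mentions() == null) {
                continue; // messages without mentions contribute nothing
            }
            for (Mention mention : message.mentions()) {
                // Credit the mentioned user, not the author of the message.
                timesMentioned.merge(mention.id(), 1L, Long::sum);
            }
        }
        return timesMentioned;
    }

    public static void main(String[] args) {
        List<Message> messages = List.of(
                new Message("alice", List.of(new Mention("bob"), new Mention("carol"))),
                new Message("carol", List.of(new Mention("bob"))),
                new Message("bob", List.of()));
        Map<String, Long> counts = countTimesMentioned(messages);
        System.out.println("bob=" + counts.get("bob") + ", carol=" + counts.get("carol"));
        // prints bob=2, carol=1
    }
}
```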

Each author has its own AuthorData object, which is updated throughout the analysis.

AuthorData.java
@Getter
@Setter
public class AuthorData {

    private transient Author author;
    private transient LocalDate earliestLocalDate;
    private transient List<Integer> wordsPerMessage = new ArrayList<>();

    private String authorId;
    private String firstMessageSent;
    private double averageWordsPerMessage;
    private long messagesSent = 0;
    private long embedsSent = 0;
    private long attachmentsSent = 0;
    private long sumEmojisReceived = 0;
    private long timesMentioned = 0;
    private Map<Emoji, Integer> emojisReceived = new TreeMap<>(new Emoji.EmojiComparator());

    // [...]
}

Statistics Generation

Each AuthorData entry is put into a java.util.TreeMap that was constructed with a custom comparator. The TreeMap keeps its keys sorted according to that comparator's compare method, so once all data has been inserted the map is already in ranked order.

For example, the author who has been mentioned the most is the first entry of the timesMentioned map.
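
The comparators in the sections below return 1 for equal values so that two authors with the same count are both kept as keys. That works when the map is only iterated, but it departs from the Comparator contract, so lookups such as get() or containsKey() on such a TreeMap are not reliable. A possible alternative, sketched here under the assumption that authorId is unique, is to break ties on the id instead. AuthorStat is a hypothetical stand-in for AuthorData.

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

class TieBreakRankingSketch {

    // Hypothetical stand-in for AuthorData with only the fields needed here.
    record AuthorStat(String authorId, long timesMentioned) {}

    // Highest count first; equal counts are ordered by the (assumed unique) author id.
    static final Comparator<AuthorStat> BY_MENTIONS_DESC =
            Comparator.comparingLong(AuthorStat::timesMentioned)
                    .reversed()
                    .thenComparing(AuthorStat::authorId);

    static TreeMap<AuthorStat, Long> rank(Iterable<AuthorStat> stats) {
        TreeMap<AuthorStat, Long> ranking = new TreeMap<>(BY_MENTIONS_DESC);
        for (AuthorStat stat : stats) {
            ranking.put(stat, stat.timesMentioned());
        }
        return ranking;
    }

    public static void main(String[] args) {
        TreeMap<AuthorStat, Long> ranking = rank(List.of(
                new AuthorStat("b", 5), new AuthorStat("a", 5), new AuthorStat("c", 9)));
        // The entry with the most mentions comes first; "a" and "b" are both kept.
        System.out.println(ranking.firstKey().authorId()); // prints c
        System.out.println(ranking.size());                // prints 3
    }
}
```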

Analyzer.java
@Nullable
public Ranking getRanking(RankingType rankingType) {
    Ranking result = null;

    switch (rankingType) {
        // [...]
        case TIMES_MENTIONED:
            result = new TimesMentionedRanking(new LinkedList<>(authorDataMap.values()));
            break;
    }

    return result;
}

calculateMentionRanking()
private void calculateMentionRanking(List<AuthorData> authorDataList) {
    timesMentioned = new TreeMap<>(new AuthorData.AuthorDataMentionsCountComparator());
    authorDataList.forEach(authorData -> timesMentioned.put(authorData, authorData.getTimesMentioned()));
}

Account Age

Ranks the authors on the analyzed server by the date of their first message, which serves as a stand-in for account age. Authors with fewer than 10 messages are skipped.

Comparator
public static class AuthorDataFirstMessageComparator implements Comparator<AuthorData> {
    @Override
    public int compare(AuthorData o1, AuthorData o2) {
        final int compare = o1.getEarliestLocalDate().compareTo(o2.getEarliestLocalDate());
        if (compare == 0) {
            return 1;
        } else {
            return compare;
        }
    }
}

Analysis
private void calculateAccountAgeRanking(List<AuthorData> authorDataList) {
    joinedServer = new TreeMap<>(new AuthorData.AuthorDataFirstMessageComparator());
    authorDataList.stream()
            .filter(authorData -> authorData.getMessagesSent() >= 10)
            .forEach(authorData -> joinedServer.put(authorData, authorData.getLocalDateAsString(authorData.getEarliestLocalDate())));
}
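
The ranking relies on earliestLocalDate, which is set from the raw timestamp string of each message. How the project converts and compares those timestamps is not shown here. A minimal sketch, assuming ISO-8601 timestamps with a UTC offset (for example 2021-03-04T18:23:11.123+00:00) and using the hypothetical helpers toLocalDate and earliest, could look like this:

```java
import java.time.LocalDate;
import java.time.OffsetDateTime;

class EarliestDateSketch {

    // Parses an ISO-8601 timestamp with offset and keeps only the calendar date.
    static LocalDate toLocalDate(String timestamp) {
        return OffsetDateTime.parse(timestamp).toLocalDate();
    }

    // Returns the earlier of the stored date and the date of the new message.
    // A null "current" means no message has been seen yet for this author.
    static LocalDate earliest(LocalDate current, String timestamp) {
        LocalDate candidate = toLocalDate(timestamp);
        return (current == null || candidate.isBefore(current)) ? candidate : current;
    }

    public static void main(String[] args) {
        LocalDate first = earliest(null, "2021-03-04T18:23:11.123+00:00");
        first = earliest(first, "2020-11-30T07:05:00+00:00");
        first = earliest(first, "2022-01-01T00:00:00+00:00");
        System.out.println(first); // prints 2020-11-30
    }
}
```

Keeping the minimum makes the result independent of the order in which messages and channels are processed.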

Average Words per Message

Comparator
public static class AvgWordCountComparator implements Comparator<Double> {
    @Override
    public int compare(Double o1, Double o2) {
        final int compare = o2.compareTo(o1);

        if (compare == 0) {
            return 1;
        } else {
            return compare;
        }
    }
}

Analysis
private void calculateAvgWordCount(List<AuthorData> authorDataList) {
    authorDataList
            .stream()
            .filter(authorData -> authorData.getMessagesSent() >= 10)
            .forEach(authorData -> {
                final double wordCountSum = authorData.getWordCountSum();
                final double messagesSent = authorData.getMessagesSent();

                if (wordCountSum > 0 && messagesSent > 0) {
                    authorData.setAverageWordsPerMessage(round(wordCountSum / messagesSent));
                    averageWordsPerMessage.put(authorData.getAverageWordsPerMessage(), authorData.getAuthor().getNickname());
                }
            });
}
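
The helpers getWordCountSum() and round(...) are not shown above. A minimal sketch of how such helpers might work follows, assuming that words are separated by whitespace and that the average is rounded to two decimal places. Both are assumptions, as are the method names countWords and averageWords.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.List;

class AvgWordCountSketch {

    // Counts whitespace-separated words; blank or null content counts as zero words.
    static int countWords(String content) {
        if (content == null || content.isBlank()) {
            return 0;
        }
        return content.trim().split("\\s+").length;
    }

    // Rounds to two decimal places (assumed precision).
    static double round(double value) {
        return BigDecimal.valueOf(value).setScale(2, RoundingMode.HALF_UP).doubleValue();
    }

    // Average words per message, or 0 if there are no messages.
    static double averageWords(List<String> messages) {
        if (messages.isEmpty()) {
            return 0;
        }
        long sum = messages.stream().mapToInt(AvgWordCountSketch::countWords).sum();
        return round((double) sum / messages.size());
    }

    public static void main(String[] args) {
        // 2 + 1 + 4 = 7 words in 3 messages
        System.out.println(averageWords(List.of("hello there", "ok", "how are you today"))); // prints 2.33
    }
}
```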

Attachments added

Comparator
public static class AuthorDataAttachmentsCountComparator implements Comparator<AuthorData> {
    @Override
    public int compare(AuthorData o1, AuthorData o2) {
        final Long attachmentsSent1 = o1.getAttachmentsSent();
        final Long attachmentsSent2 = o2.getAttachmentsSent();
        final int compare = attachmentsSent2.compareTo(attachmentsSent1);

        if (compare == 0) {
            return 1;
        } else {
            return compare;
        }
    }
}

Analysis
private void calculateAttachments(List<AuthorData> authorDataList) {
    mostAttachments = new TreeMap<>(new AuthorData.AuthorDataAttachmentsCountComparator());
    authorDataList.forEach(authorData -> mostAttachments.put(authorData, authorData.getAttachmentsSent()));
}

private void countAttachments(List<AuthorData> authorDataList) {
    for (AuthorData authorData : authorDataList) {
        attachmentsSent = attachmentsSent + authorData.getAttachmentsSent();
    }
}
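
The attachment, embed, message and mention rankings all have the same shape: sort the authors by one long counter in descending order, and sum that counter over all authors. As a sketch of how this could be written once, the following selects the counter through a ToLongFunction. AuthorCounts, rankBy and total are hypothetical names, and the tie-break on authorId is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToLongFunction;

class CounterRankingSketch {

    // Hypothetical stand-in for AuthorData.
    record AuthorCounts(String authorId, long attachmentsSent, long embedsSent) {}

    // Returns a copy of the authors sorted by the selected counter, highest first.
    static List<AuthorCounts> rankBy(List<AuthorCounts> authors, ToLongFunction<AuthorCounts> counter) {
        List<AuthorCounts> sorted = new ArrayList<>(authors);
        sorted.sort(Comparator.comparingLong(counter)
                .reversed()
                .thenComparing(AuthorCounts::authorId)); // deterministic order for ties
        return sorted;
    }

    // Sums the selected counter over all authors.
    static long total(List<AuthorCounts> authors, ToLongFunction<AuthorCounts> counter) {
        return authors.stream().mapToLong(counter).sum();
    }

    public static void main(String[] args) {
        List<AuthorCounts> authors = List.of(
                new AuthorCounts("a", 1, 7),
                new AuthorCounts("b", 4, 0),
                new AuthorCounts("c", 4, 2));
        System.out.println(rankBy(authors, AuthorCounts::attachmentsSent).get(0).authorId()); // prints b
        System.out.println(total(authors, AuthorCounts::embedsSent));                         // prints 9
    }
}
```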

Embeds added

Comparator
public static class AuthorDataEmbedsCountComparator implements Comparator<AuthorData> {
    @Override
    public int compare(AuthorData o1, AuthorData o2) {
        final Long embedsSent1 = o1.getEmbedsSent();
        final Long embedsSent2 = o2.getEmbedsSent();
        final int compare = embedsSent2.compareTo(embedsSent1);

        if (compare == 0) {
            return 1;
        } else {
            return compare;
        }
    }
}

Analysis
private void calculateEmbeds(List<AuthorData> authorDataList) {
    mostEmbeds = new TreeMap<>(new AuthorData.AuthorDataEmbedsCountComparator());
    authorDataList.forEach(authorData -> mostEmbeds.put(authorData, authorData.getEmbedsSent()));
}

private void countEmbeds(List<AuthorData> authorDataList) {
    for (AuthorData authorData : authorDataList) {
        embedsSent = embedsSent + authorData.getEmbedsSent();
    }
}

Emoji Reaction count

Comparator
public static class EmojiCountComparator implements Comparator<Emoji> {
    @Override
    public int compare(Emoji o1, Emoji o2) {
        final int compare = o2.count.compareTo(o1.count);

        if (compare == 0) {
            return 1;
        } else {
            return compare;
        }
    }
}

Analysis
private void calculateMostCommonReaction(List<AuthorData> authorDataList) {
    Map<Emoji, Integer> emojiCount = new HashMap<>();
    authorDataList.forEach(authorData -> {
        final Map<Emoji, Integer> emojisReceived = authorData.getEmojisReceived();
        emojisReceived.forEach((key, value) -> {
            if (emojiCount.containsKey(key)) {
                emojiCount.put(key, emojiCount.get(key) + value);
            } else {
                emojiCount.put(key, value);
            }
        });
    });

    // write emoji count
    for (Map.Entry<Emoji, Integer> emoji : emojiCount.entrySet()) {
        emoji.getKey().setCount(emoji.getValue());
    }
    mostCommonReaction.putAll(emojiCount);
}
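
The aggregation step can also be expressed with Map.merge, followed by a sort on the summed counts. The sketch below uses emoji names as plain String keys instead of the project's Emoji class, which is an assumption made to keep the example self-contained; the method name mostCommonReactions is hypothetical as well.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReactionTotalsSketch {

    // Sums the per-author reaction counts into one total per emoji,
    // then returns them ordered by count (highest first), ties by name.
    static Map<String, Integer> mostCommonReactions(List<Map<String, Integer>> perAuthor) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map<String, Integer> received : perAuthor) {
            // merge() inserts the count or adds it to the existing total
            received.forEach((emoji, count) -> totals.merge(emoji, count, Integer::sum));
        }

        Map<String, Integer> ranked = new LinkedHashMap<>(); // keeps insertion order
        totals.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Integer>comparingByKey()))
                .forEachOrdered(entry -> ranked.put(entry.getKey(), entry.getValue()));
        return ranked;
    }

    public static void main(String[] args) {
        Map<String, Integer> ranked = mostCommonReactions(List.of(
                Map.of("thumbsup", 2, "heart", 1),
                Map.of("thumbsup", 1, "joy", 1)));
        System.out.println(ranked); // prints {thumbsup=3, heart=1, joy=1}
    }
}
```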

Messages sent

Comparator
public static class AuthorDataMessagesCountComparator implements Comparator<AuthorData> {
    @Override
    public int compare(AuthorData o1, AuthorData o2) {
        final Long messagesSent1 = o1.getMessagesSent();
        final Long messagesSent2 = o2.getMessagesSent();
        final int compare = messagesSent2.compareTo(messagesSent1);

        if (compare == 0) {
            return 1;
        } else {
            return compare;
        }
    }
}

Analysis
private void calculateMessageRanking(List<AuthorData> authorDataList) {
    mostMessages = new TreeMap<>(new AuthorData.AuthorDataMessagesCountComparator());
    authorDataList.forEach(authorData -> mostMessages.put(authorData, authorData.getMessagesSent()));
}

private void countMessages(List<AuthorData> authorDataList) {
    for (AuthorData authorData : authorDataList) {
        messagesSent = messagesSent + authorData.getMessagesSent();
    }
}

Times Mentioned

Comparator
public static class AuthorDataMentionsCountComparator implements Comparator<AuthorData> {
    @Override
    public int compare(AuthorData o1, AuthorData o2) {
        final Long timesMentioned1 = o1.getTimesMentioned();
        final Long timesMentioned2 = o2.getTimesMentioned();
        final int compare = timesMentioned2.compareTo(timesMentioned1);

        if (compare == 0) {
            return 1;
        } else {
            return compare;
        }
    }
}

Analysis
private void calculateMentionRanking(List<AuthorData> authorDataList) {
    timesMentioned = new TreeMap<>(new AuthorData.AuthorDataMentionsCountComparator());
    authorDataList.forEach(authorData -> timesMentioned.put(authorData, authorData.getTimesMentioned()));
}

private void countMentions(List<AuthorData> authorDataList) {
    for (AuthorData authorData : authorDataList) {
        countMentions = countMentions + authorData.getTimesMentioned();
    }
}

UML
