package Interview;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;
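
/**
 * Finds the most frequent IPs in a large log file (one IP per line):
 * hash-partition the file, count each partition, keep each partition's
 * top entries, then merge them into the overall top list.
 */
public class IPCount {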
    private static final int NUMBER_OF_FILES = 100;
    private static final int topN = 100;
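
    /**
     * Splits the input into NUMBER_OF_FILES partition files, routing each line by its hash
     * so that every occurrence of the same IP lands in the same partition.
     */
    public static void splitLargeFile(String inputFilePath) throws IOException {
        BufferedWriter[] writers = new BufferedWriter[NUMBER_OF_FILES];
        try {
            for (int i = 0; i < NUMBER_OF_FILES; i++) {
                writers[i] = Files.newBufferedWriter(Paths.get(inputFilePath + "_part_" + i + ".txt"));
            }

            try (BufferedReader reader = Files.newBufferedReader(Paths.get(inputFilePath))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // The remainder lies strictly between -NUMBER_OF_FILES and NUMBER_OF_FILES,
                    // so Math.abs cannot overflow here.
                    int fileIndex = Math.abs(line.hashCode() % NUMBER_OF_FILES);
                    writers[fileIndex].write(line);
                    writers[fileIndex].newLine();
                }
            }
        } finally {
            // Close every writer that was opened, even if reading or writing failed.
            for (BufferedWriter writer : writers) {
                if (writer != null) {
                    writer.close();
                }
            }
        }
    }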

    /** Counts how often each line (IP) occurs in one partition file. */
    public static Map<String, Integer> countIPsInFile(String fileName) throws IOException {
        Map<String, Integer> ipCount = new HashMap<>();

        try (BufferedReader reader = Files.newBufferedReader(Paths.get(fileName))) {
            String line;
            while ((line = reader.readLine()) != null) {
                ipCount.merge(line, 1, Integer::sum);
            }
        }

        return ipCount;
    }
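
    /** Returns up to topN entries with the highest counts, in no particular order. */
    public static List<Map.Entry<String, Integer>> findTopIPs(Map<String, Integer> ipMap) {
        // Min-heap on count: the smallest of the current candidates sits at the head and is evicted first.
        PriorityQueue<Map.Entry<String, Integer>> ipQueue =
                new PriorityQueue<>((o1, o2) -> Integer.compare(o1.getValue(), o2.getValue()));

        for (Map.Entry<String, Integer> entry : ipMap.entrySet()) {
            // Copy the entry so the result stays valid if the caller later clears or modifies ipMap.
            ipQueue.offer(new AbstractMap.SimpleImmutableEntry<>(entry));
            if (ipQueue.size() > topN) {
                ipQueue.poll();
            }
        }

        return new ArrayList<>(ipQueue);
    }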

    /**
     * Merges the per-partition top lists into up to topN overall entries, in no particular order.
     * Each IP lives in exactly one partition, so its count in that partition is its total count.
     */
    public static List<Map.Entry<String, Integer>> mergeTopIps(List<List<Map.Entry<String, Integer>>> topIpList) {
        PriorityQueue<Map.Entry<String, Integer>> ipQueue =
                new PriorityQueue<>((o1, o2) -> Integer.compare(o1.getValue(), o2.getValue()));

        for (List<Map.Entry<String, Integer>> list : topIpList) {
            for (Map.Entry<String, Integer> ipEntry : list) {
                ipQueue.offer(ipEntry);
                if (ipQueue.size() > topN) {
                    ipQueue.poll();
                }
            }
        }

        return new ArrayList<>(ipQueue);
    }
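
    /** Prints the given entries and writes them to top_100_ip.txt, one tab-separated IP and count per line. */
    public static void saveTopIP(List<Map.Entry<String, Integer>> ipList) throws IOException {
        // try-with-resources flushes and closes the writer; without it the buffered output may never reach the file.
        try (BufferedWriter writer = Files.newBufferedWriter(Paths.get("top_100_ip.txt"))) {
            for (Map.Entry<String, Integer> ip : ipList) {
                String row = ip.getKey() + "\t" + ip.getValue();
                System.out.println(row);
                writer.write(row);
                writer.newLine();
            }
        }
    }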

    public static void main(String[] args) throws IOException {
        String largeFile = "ip_log.txt";
        splitLargeFile(largeFile);

        // Process one partition at a time so only a single partition's counts are held in memory.
        List<List<Map.Entry<String, Integer>>> partialTopIPList = new ArrayList<>();
        for (int i = 0; i < NUMBER_OF_FILES; i++) {
            Map<String, Integer> counts = countIPsInFile(largeFile + "_part_" + i + ".txt");
            partialTopIPList.add(findTopIPs(counts));
            counts.clear();
        }

        List<Map.Entry<String, Integer>> topIpList = mergeTopIps(partialTopIPList);
        // Sort by count, highest first, before saving.
        topIpList.sort((o1, o2) -> Integer.compare(o2.getValue(), o1.getValue()));
        saveTopIP(topIpList);
    }
}