Part 2: The Art of Password Cracking (with science!) – Using Readily Available Resources

Part 2: The Art of Password Cracking (with science!) – Using Readily Available Resources

February 1, 2020

In our first post about The Art of Password Cracking (with science!) we analysed the performance of as many permutations as possible.

This week, let’s first look at what results some of the dictionaries and rulesets that are readily available can generate against some of the sets of hashes with access. Here is one set scored against the four metrics we’ve identified in part 1:

  1. The percentage of the total hashes cracked by a given combination per time unit.
  2. The average percentage of passwords per keyspace unit that a combination produces on average.
  3. The percentage of password cracked by this combination alone, i.e. those not found by any other combination.
  4. Total percentage of hashes cracked upon combination completion.

In the analysis, a set of hashes from four engagements were analysed as well as the hashes found in the Staminus breach of 2016 just as a comparison. Obviously, this whole endeavour is limited by the amount of time my machine can spend cracking hashes, as well the time it takes to seek out and try new combinations of dictionaries and rulesets.

As we learn more about various combinations, we can start to exclude them before our tests start to take way too much time. First of all, let’s see which combination has the absolute highest score, i.e. metric #4:

From this initial list it is clear that the people at weakpass  as well as hashkiller have done an admirable job of cracking as much as possible from the password hashes that were exposed in the Staminus breach. Clearly a disproportionate amount of computing time has gone in the community effort of trying to recover as many hashes as possible from the public staminus breach.

Our other hash collections are not nearly as exhaustively cracked, although client002 reached an admirable 48.4% using hash killer’s dictionary in combination with an InsidePro rule file. If we ignore the staminus breach and focus on this and the other hash sets, we can see the following total cracked percentages:

Figure 2: Sorted as before by total % recovered but ignoring “staminus” entries

 

Figure 3: Idem, focusing on client001

 

Figure 4: client003

 

Figure 5: and finally, client004

The total amount of cracked hashes shows considerable variation between the different hash sets as is obvious from the above excerpts. Nonetheless, we can draw a few conclusions of the combinations that produced the highest scores in each instance.

Client003 comes into view with a custom wordlist where, I combined all passwords from the popular Seclist repository into one large file and this in combination with the “one rule that rules them all” which seems to live up to its name on this occasion. In the other three cases the top prize goes out to the hashkiller list combined with the PasswordsPro rules.

Now let’s focus on metric 2, the number of hits we get per given keyspace unit. Because keyspaces sizes vary many orders of magnitude between for example small custom lists and large masks, I’ve scored this as follows:

LOG_HR_PERC_PER_KEYSPACE=$(Rscript -e “cat( -log(${HR_PERC}/${JOB_PROGRESS_KEYSPACE})/log(10),sep=\”\\n\”)”)

This is, in other words, the negative log of the percentage of recovered passwords, divided by the total keyspace that was searched. So if 1% of the passwords was recovered by searching a million keys in the keyspace, this would generate -log(1/1,000,000) = 6. The lower a value therefore is, the greater signal to noise ratio if you will, or in other words, the “denser” the keyspace is filled with password hits.

Figure 6: All results sorted by the combination that contains the densest concentration of valid passwords (Note that some combinations appear double which is due to a minor data problem where capitalised vs non capitalised words were used, producing slightly different results).

Here we see proof that passwords are often chosen based on the environment in which they are used, e.g. the domain name or usernames they are attached to. We also see that the password list used by John: (https://www.openwall.com/john/) still remains a very good resource to use in the first instance. The number of recovered passwords is still low in absolute terms, but these types of searches take the least amount of time due to the small size of the keyspace,  so they should definitely be run in the first instance on any hashes that are recovered.

On to metric 3 where we look at which combinations produce those rare gems, namely unique passwords not recovered through any other means:

Figure 7: All combinations sorted by the % of unique passwords they generated.

As could be expected, some of the masks score very high, as they would naturally provide the only way of finding passwords that are not on any list or computed with simple rules, i.e. complex randomly generated passwords.

These figures however are also skewed slightly because there a number of similar rulesets, which if condensed into one set may also prove to uniquely find passwords by themselves. In that case however, it would be more prudent to compare all results from a particular list and compare these to all results of all other lists, and the same could be done with combinations of different types of rules. However, such comparisons will require further time and investigation and maybe covered in a later instalment.

On to the last remaining metric that was not covered yet, #1:

Figure 8: The sets sorted by how many % are recovered each second using the given combinations.

Again, John’s list scores quite well in pure passwords found per second, but it seems the client002 hash set is also especially susceptible to this. As for the others, they start to appear further down the list:

Figure 9: idem focusing in on client004.

 

And the next few entries for client001:

 

And finally, client003, further down the list:

What we learn from the above is that both John as well as the seclists collection both score high on a pure password per second measure and even the large crackstation list is on top for one of the hash sets. Still, there doesn’t appear to be a universal best combination that can be identified.

In the next part of the series we will try to generate custom lists and see what takes place. These will include lists and masks based on passwords already recovered through all previous methods, using language-specific password lists as well as an initial attempt at creating smarter masks using custom characters to target various date formats.

Share with your network

Related Articles
  • Server-Side Request Forgery Demo

  • 2nd Update: SameSite cookie changes

  • Introduction to Server-Side Request Forgery