1998-08-06 - Re: text analysis

Header Data

From: Mok-Kong Shen <mok-kong.shen@stud.uni-muenchen.de>
To: cypherpunks@toad.com
Message Hash: add19c61504a48a623319a6c2312b23fc9bd2b112e125f05eafcf3cbd45403d7
Message ID: <35C99712.A1572FA0@stud.uni-muenchen.de>
Reply To: <Pine.LNX.3.96.980806105040.20223u-100000@freenet.bishkek.su>
UTC Datetime: 1998-08-06 11:44:34 UTC
Raw Date: Thu, 6 Aug 1998 04:44:34 -0700 (PDT)

Raw message

From: Mok-Kong Shen <mok-kong.shen@stud.uni-muenchen.de>
Date: Thu, 6 Aug 1998 04:44:34 -0700 (PDT)
To: cypherpunks@toad.com
Subject: Re: text analysis
In-Reply-To: <Pine.LNX.3.96.980806105040.20223u-100000@freenet.bishkek.su>
Message-ID: <35C99712.A1572FA0@stud.uni-muenchen.de>
MIME-Version: 1.0
Content-Type: text/plain


CyberPsychotic wrote:

> text). Anyways, when things come to 2 characters set, i have to get 1024
> character set, and so on, which looks quite unreasonable to me to allocate
> memory for elements, which probably will be never found in text... I was
> thinking of other solution and came to two way connected lists (correct
> term?)  things, i.e. : i have some structure like:
> 
> struct element {
>     char value[ELEMENT_LENGTH];
>     unsigned int frequency;
>     struct element *previous;
>     struct element *next;
> };
>  and could dynamically allocate memory for each newly found element, but
> this would slow down the whole code as the list of new elements grows.

I think memory is currently cheap enough that you could do
frequency counts of at least trigrams with a one-dimensional array.
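
A minimal sketch of what such a one-dimensional count table might look
like, assuming the text is reduced to the 26 letters a-z (case folded,
everything else treated as a separator) and an ASCII-compatible character
set. The names ALPHABET, NUM_TRIGRAMS, trigram_index and count_trigrams
are illustrative only and do not come from the thread. With these
assumptions the table has 26^3 = 17576 entries, roughly 69 KB with 4-byte
counters; counting over all 256 byte values would instead need 256^3
entries, about 64 MB with 4-byte counters.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define ALPHABET 26                                   /* assumption: letters a-z only */
#define NUM_TRIGRAMS (ALPHABET * ALPHABET * ALPHABET) /* 17576 slots */

/* Map three letter codes (each 0..25) to one slot of the flat array. */
static size_t trigram_index(unsigned a, unsigned b, unsigned c)
{
    assert(a < ALPHABET && b < ALPHABET && c < ALPHABET);
    return ((size_t)a * ALPHABET + b) * ALPHABET + c;
}

/*
 * Count letter trigrams in a NUL-terminated string.
 * counts must point to NUM_TRIGRAMS entries; it is zeroed first.
 * Any non-letter resets the window, so trigrams never span separators.
 */
static void count_trigrams(const char *text, unsigned int *counts)
{
    unsigned window[3] = {0, 0, 0};
    int filled = 0;

    memset(counts, 0, NUM_TRIGRAMS * sizeof counts[0]);

    for (const unsigned char *p = (const unsigned char *)text; *p; ++p) {
        int ch = tolower(*p);
        if (ch < 'a' || ch > 'z') {   /* assumes a-z are contiguous (ASCII) */
            filled = 0;
            continue;
        }
        /* Slide the three-letter window along by one position. */
        window[0] = window[1];
        window[1] = window[2];
        window[2] = (unsigned)(ch - 'a');
        if (filled < 3)
            filled++;
        if (filled == 3)
            counts[trigram_index(window[0], window[1], window[2])]++;
    }
}
```

A caller would declare `unsigned int counts[NUM_TRIGRAMS];`, call
`count_trigrams(text, counts);`, and then read the count for, say, "the"
at `counts[trigram_index('t' - 'a', 'h' - 'a', 'e' - 'a')]`. Lookup and
update cost the same regardless of how many distinct trigrams have been
seen, which is the trade against the linked-list approach: unused slots
cost memory but no time.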

M. K. Shen

Thread

1998-08-06 (Wed, 5 Aug 1998 22:10:51 -0700 (PDT)) - text analysis - CyberPsychotic <fygrave@freenet.bishkek.su>
- 1998-08-06 (Thu, 6 Aug 1998 04:44:34 -0700 (PDT)) - Re: text analysis - Mok-Kong Shen <mok-kong.shen@stud.uni-muenchen.de>
- 1998-08-07 (Thu, 6 Aug 1998 23:12:56 -0700 (PDT)) - Re: text analysis - Bill Stewart <bill.stewart@pobox.com>
  - 1998-08-09 (Sun, 9 Aug 1998 03:29:30 -0700 (PDT)) - Re: text analysis - CyberPsychotic <fygrave@freenet.bishkek.su>
