[aklug] Re: [OT] Re: random bits vs random Hex

From: James <marblemunkey@gmail.com>
Date: Wed May 29 2013 - 13:52:29 AKDT

Just to play devil's advocate: any truly random stream of arbitrary
length certainly _could_ contain patterns of some smaller length
_somewhere_ within it, even if any particular repeat is unlikely.
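
As a rough illustration (a throwaway Python sketch of my own; the random
source and the sizes are just arbitrary choices), short repeats turn up
readily in even a modest amount of random data:

    # Illustrative only: count short patterns that recur in random bytes.
    # os.urandom stands in here for "a truly random stream".
    import os
    from collections import Counter

    data = os.urandom(1_000_000)   # ~1 MB of random bytes

    # Tally every overlapping 3-byte window.
    counts = Counter(data[i:i + 3] for i in range(len(data) - 2))

    repeats = sum(1 for c in counts.values() if c > 1)
    print(repeats, "distinct 3-byte patterns occur more than once")

With roughly 10^6 windows drawn from about 1.7 * 10^7 possible 3-byte
values, plenty of them collide, so small patterns are all but guaranteed
even though no particular one is likely in advance. (A second sketch after
the quoted thread below looks at the compression side of the question.)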

On Wed, May 29, 2013 at 5:17 PM, Doug Davey <doug.davey@gmail.com> wrote:
> Bryan is right: if you have saved the binary data poorly, in an odd way,
> there is potential for fluff that could easily be trimmed by a
> compression algorithm.
>
> As for pattern recognition, a random stream will hopefully have no patterns,
> so further compression won't work. If it does compress, then the stream
> wasn't random.
>
>
> On Wed, May 29, 2013 at 1:12 PM, <bryanm@acsalaska.net> wrote:
>>
>> On Wed, May 29, 2013 12:26 pm, Arthur Corliss wrote:
>> > On Wed, 29 May 2013, bryanm@acsalaska.net wrote:
>> >
>> >> I don't know enough to address entropy, but I can say that changing
>> >> from binary triplets to decimal digits leaves some of the pattern
>> >> space unused (i.e. 8 and 9). In other words, the same data takes up
>> >> more space, leaving open the possibility for an algorithm to compress
>> >> it back to close to its original size.
>> >
>> > If this were true then we'd be able to get great compression on any
>> > data, random or not. By your logic, compressing binary data should be
>> > awesome, since there are only two choices: 1 or 0.
>>
>> That's not what I'm saying at all. I'm talking about unused pattern space.
>> As an extreme example, imagine representing binary data by letting each
>> *byte* represent either a 0 or a 1. Obviously, there would be tremendous
>> opportunity for compression. The same thing happens (to a lesser degree)
>> in my binary triplet -> decimal digit conversion. In each case, there
>> are some possible values for each data element that will *never* be used.
>>
>> I'm speaking mathematically, and don't claim to know how to implement an
>> algorithm to take advantage of this property.
>>
>> > You can't cheat around the basic problem of pattern recognition by
>> > changing how the same data is presented. Choosing to evaluate smaller
>> > chunks of data is a zero-sum game, because you either have to inflate
>> > your translation maps or look for longer pattern strings than you would
>> > in larger chunks. In the end, it's the repeatability of data chunks,
>> > regardless of presentation, that will determine compressibility.
>>
>> The idea of pattern recognition for the purpose of data compression
>> intrigues me, though I've never fully researched the details.
>>
>> --
>> Bryan Medsker
>> bryanm@acsalaska.net
>>
>
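
For what it's worth, here's that second throwaway sketch (Python again,
with zlib and an octal-digit encoding standing in for the binary triplet
-> decimal digit conversion Bryan describes). It puts rough numbers on
both points: the raw random stream doesn't compress, while the inflated
representation compresses back toward its original size:

    # Illustrative only: random bytes vs. the same bits written out as
    # ASCII digits '0'..'7' ('8' and '9' never appear, so most of the
    # byte-sized pattern space is wasted).
    import os
    import zlib

    raw = os.urandom(30_000)

    bits = ''.join(f'{b:08b}' for b in raw)
    digits = ''.join(str(int(bits[i:i + 3], 2))
                     for i in range(0, len(bits) - 2, 3)).encode()

    print(len(raw), '->', len(zlib.compress(raw, 9)))        # little or no gain
    print(len(digits), '->', len(zlib.compress(digits, 9)))  # shrinks a lot

The second line should come back close to 3/8 of the digit string's size,
since each digit only carries 3 bits of information, which is Bryan's point
about unused pattern space; the first line shows Doug's point that the
underlying random data itself has nothing left to squeeze.
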
---------
To unsubscribe, send email to <aklug-request@aklug.org>
with 'unsubscribe' in the message body.
Received on Wed May 29 13:52:55 2013

This archive was generated by hypermail 2.1.8 : Wed May 29 2013 - 13:52:55 AKDT