Tuesday, January 4, 2011

Cartesian

I see your hacky method overloading ...

Luckily with Python you can implement your own hacky little overloads.


    from collections import defaultdict
    from itertools import product

    def argslist(args):
        # Expand 'A|B' alternatives into every concrete tuple of types.
        return [tuple(t) for t in product(*[a.split('|') for a in args])]

    class Signatures(defaultdict):
        def __init__(self):
            self.seen = defaultdict(set)
            defaultdict.__init__(self, list)

        def __setitem__(self, key, item):
            if isinstance(key, tuple):
                m, args = key
                for k in argslist(args):
                    self.seen[m].add(k)
                self[m].append(item)
            else:
                defaultdict.__setitem__(self, key, item)

        def __contains__(self, key):
            # key is (method_name, types): true if any expansion was already seen.
            m, args = key
            return any(k in self.seen[m] for k in argslist(args))


    from pyquery import PyQuery

    def parse_sigs():
        api = PyQuery(filename='api.xml')
        sigs = Signatures()

        for sig in api('entries entry[type=method] signature'):
            entry = sig.getparent()
            name = entry.get('name')

            # One candidate signature per subset of the optional arguments.
            for args in arg_combinations(PyQuery(sig)('argument')):
                # parse_sig is not shown in this post.
                k, v = parse_sig(name, args)
                v.update({'sig': PyQuery(sig)})

                if k:
                    key = (k, tuple(v['types']))

                    # Skip signatures whose type expansion was already seen.
                    if key not in sigs:
                        sigs[key] = v

        return dict(sigs)

    'toggle': [{'args': ['handler(eventObject)', 'handler(eventObject)'],
                'name': 'toggle',
                'sig': [<signature>],
                'sig_str': 'toggle(handler(eventObject), handler(eventObject))',
                'types': ['Function', 'Function']},
               {'args': ['handler(eventObject)',
                         'handler(eventObject)',
                         'handler(eventObject)'],
                'name': 'toggle',
                'sig': [<signature>],
                'sig_str': 'toggle(handler(eventObject), handler(eventObject), handler(eventObject))',
                'types': ['Function', 'Function', 'Function']},
               {'args': [],
                'name': 'toggle',
                'sig': [<signature>],
                'sig_str': 'toggle()',
                'types': []},
               {'args': ['duration'],
                'name': 'toggle',
                'sig': [<signature>],
                'sig_str': 'toggle(duration)',
                'types': ['String']},
               {'args': ['callback'],
                'name': 'toggle',
                'sig': [<signature>],
                'sig_str': 'toggle(callback)',
                'types': ['Callback']},
               {'args': ['duration', 'callback'],
                'name': 'toggle',
                'sig': [<signature>],
                'sig_str': 'toggle(duration, callback)',
                'types': ['String', 'Callback']},
               {'args': ['duration', 'easing'],
                'name': 'toggle',
                'sig': [<signature>],
                'sig_str': 'toggle(duration, easing)',
                'types': ['String', 'String']},
               {'args': ['duration', 'easing', 'callback'],
                'name': 'toggle',
                'sig': [<signature>],
                'sig_str': 'toggle(duration, easing, callback)',
                'types': ['String', 'String', 'Callback']},
               {'args': ['showOrHide'],
                'name': 'toggle',
                'sig': [<signature>],
                'sig_str': 'toggle(showOrHide)',
                'types': ['Boolean']}],

I haven't written tests (in JavaScript) but those sigs look fairly legit.
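For what it's worth, here is a small standalone sketch of the Cartesian expansion that argslist performs, with the function repeated so the block runs on its own. The type strings are illustrative values in the same 'A|B' style as the API dump, not taken from one particular signature.

```python
from itertools import product

def argslist(args):
    # Each 'A|B' entry contributes its alternatives; product() combines them.
    return [tuple(t) for t in product(*[a.split('|') for a in args])]

expanded = argslist(('String|Number', 'Callback'))
print(expanded)  # [('String', 'Callback'), ('Number', 'Callback')]

# Two multi-type arguments multiply out to 2 x 2 = 4 concrete signatures.
print(len(argslist(('String|Number', 'String|Function'))))  # 4
```

Because every alternative is expanded up front, checking whether a signature has been seen before is just set membership on plain tuples.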

Method (Overload) Madness

    _toggle: jQuery.fn.toggle,

    toggle: function( fn, fn2, callback ) {
        var bool = typeof fn === "boolean";

        if ( jQuery.isFunction(fn) && jQuery.isFunction(fn2) ) {
            this._toggle.apply( this, arguments );

        } else if ( fn == null || bool ) {
            this.each(function() {
                var state = bool ? fn : jQuery(this).is(":hidden");
                jQuery(this)[ state ? "show" : "hide" ]();
            });

        } else {
            this.animate(genFx("toggle", 3), fn, fn2, callback);
        }

        return this;
    },

    ....

    animate: function( prop, speed, easing, callback ) {
        var optall = jQuery.speed(speed, easing, callback);

        if ( jQuery.isEmptyObject( prop ) ) {
            return this.each( optall.complete );
        }

    ....

    jQuery.extend({
        speed: function( speed, easing, fn ) {
            var opt = speed && typeof speed === "object" ? jQuery.extend({}, speed) : {
                complete: fn || !fn && easing ||
                    jQuery.isFunction( speed ) && speed,
                duration: speed,
                easing: fn && easing || easing && !jQuery.isFunction(easing) && easing
            };

            opt.duration = jQuery.fx.off ? 0 : typeof opt.duration === "number" ? opt.duration :
                opt.duration in jQuery.fx.speeds ? jQuery.fx.speeds[opt.duration] : jQuery.fx.speeds._default;

            // Queueing
            opt.old = opt.complete;
            opt.complete = function() {
                if ( opt.queue !== false ) {
                    jQuery(this).dequeue();
                }
                if ( jQuery.isFunction( opt.old ) ) {
                    opt.old.call( this );
                }
            };

            return opt;
        },





It's interesting. I thought they would have a more sophisticated and abstract dispatch method. Something declarative and usable by plugins.

I was hoping to be able to enumerate the different valid signature types, but I have a feeling there isn't going to be some 'easily' exploitable pattern. The args aren't optional in a strict left-to-right manner, where passing an optional arg implies every arg to its left has also been passed. You can call toggle with a callback function as the sole argument. Complicating things further, some arguments are described as accepting multiple types: duration can be a number or a string.

fadeTo Shenanigans


Yet the following works perfectly:

    $(this).fadeTo(function(){ console.log('weird')}, Math.random());

Not that it would be a good idea to rely upon implementation quirks.


I'm just interested in automatically computing the different valid combinations of working signatures. Should be doable: work left to right, taking the first combination of each signature. Cheating for a second and altering my routines to swap out things like "String|Number" for "String", I can see I'm on the right path:



    'toggle()': {'args': [], 'name': 'toggle', 'sig': [], 'types': []},
    'toggle(Boolean)': {'args': ['showOrHide'],
                        'name': 'toggle',
                        'sig': [],
                        'types': ['Boolean']},
    'toggle(Callback)': {'args': ['callback'],
                         'name': 'toggle',
                         'sig': [],
                         'types': ['Callback']},
    'toggle(Function, Function)': {'args': ['handler(eventObject)',
                                            'handler(eventObject)'],
                                   'name': 'toggle',
                                   'sig': [],
                                   'types': ['Function', 'Function']},
    'toggle(Function, Function, Function)': {'args': ['handler(eventObject)',
                                                      'handler(eventObject)',
                                                      'handler(eventObject)'],
                                             'name': 'toggle',
                                             'sig': [],
                                             'types': ['Function',
                                                       'Function',
                                                       'Function']},
    'toggle(String)': {'args': ['duration'],
                       'name': 'toggle',
                       'sig': [],
                       'types': ['String']},
    'toggle(String, Callback)': {'args': ['duration', 'callback'],
                                 'name': 'toggle',
                                 'sig': [],
                                 'types': ['String', 'Callback']},
    'toggle(String, String)': {'args': ['duration', 'easing'],
                               'name': 'toggle',
                               'sig': [],
                               'types': ['String', 'String']},
    'toggle(String, String, Callback)': {'args': ['duration',
                                                  'easing',
                                                  'callback'],
                                         'name': 'toggle',
                                         'sig': [],
                                         'types': ['String',
                                                   'String',
                                                   'Callback']},

I could make a special class for the types currently represented by a string sourced directly from the XML. This could compare "String|Number" == "Number" as true. Then tuples could be compared: ('String|Number', 'Callback') == ('Number', 'Callback').
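A minimal sketch of what such a class might look like, assuming "equal" means "the alternatives overlap". The names TypeSpec and types are made up for illustration. One caveat: overlap is not transitive ('String|Number' equals both 'String' and 'Number', which don't equal each other), so no consistent hash exists and the sketch disables hashing rather than pretend otherwise.

```python
class TypeSpec:
    """Hypothetical wrapper around a type string such as 'String|Number'."""

    def __init__(self, spec):
        self.spec = spec
        self.options = frozenset(spec.split('|'))

    def __eq__(self, other):
        if isinstance(other, str):
            other = TypeSpec(other)
        if not isinstance(other, TypeSpec):
            return NotImplemented
        # Equal when at least one alternative is shared.
        return bool(self.options & other.options)

    # Overlap equality is not transitive, so these objects must not be hashed.
    __hash__ = None

    def __repr__(self):
        return 'TypeSpec(%r)' % self.spec


def types(*specs):
    # Convenience helper (also hypothetical): build a tuple of TypeSpec objects.
    return tuple(TypeSpec(s) for s in specs)


print(types('String|Number', 'Callback') == types('Number', 'Callback'))  # True
print(types('String', 'Callback') == types('Number', 'Callback'))         # False
```

Tuples compare element by element, so the tuple comparison falls out of the element comparison for free. The price is that these objects can't be used as dict keys or set members as they stand.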

5 minutes of fame

Don't really know how much you can read into an online test, but seeing as I did well I'll not criticize it too much :) Not bad after such a big break.


The pay on oDesk is pretty lousy but it's all relative. If you earned 10 dollars an hour and lived in Cambodia? You can live in a resort on the beach for 500 a month there and it gets a lot cheaper than that ...

Brushing Up

Some eBooks I have been, am currently, and plan to be, reading

  • jQuery in Action
  • JavaScript - The Definitive Guide - 5th Edition
  • Firebug 1.5 (Packt, April 2010)
  • Rapid GUI Programming with Python and Qt - The Definitive Guide to PyQt Programming
  • CSS: The Missing Manual, 2nd Edition (O'Reilly, September 2009)
  • Selenium 1.0 Testing Tools Beginner’s Guide
  • JavaScript Testing Beginner's Guide 2010 Packt
  • SQL Pocket Guide, Third Edition
  • Pro JavaScript with MooTools - Learning Advanced JavaScript Programming

SRS, jQuery

Back

My dear readers will be glad to know that I'm back. Where the hell have I been, I imagine them asking (I must have a great imagination, considering no one reads this blog!)

For the last year or so I've been studying music, one of the great unrequited loves in my life. I realised, or shall I say, I came to an even deeper understanding of, how profoundly untalented a musician I am. This is not to say that I don't believe I have potential, but I'm positive it will take a Bruce Lee-like zeal and heretofore unseen training methods to unleash it. I haven't given up. I'm just taking a timeout to develop more programming skills to apply to the problem.

Throughout the year, struggling away at ear training with the available software, I found myself yearning for something better. Being one of those arrogant "I could do this better" programmer types, cursed also with a low pain threshold ("I have to use a mouse?? kill me!"), I found ideas starting to form in my mind. I didn't have the time or inclination to work on any of them, however, as I was busy studying and pretty much burned out on anything IT related.

But that's neither here nor there. The central topic of this blogpost is 'scheduled repetition software'. Why lead in with your mediocre musician's lament?

Well, if you've ever *completely stopped* something for any length of time you'll know the danger of gaining rust and losing insights. In the skydiving world they call this state 'uncurrent', and you are considered a danger to yourself and others if you go for longer than a few months without jumping. We've all heard the "Use it or lose it" maxim.

Having finished up the year of study, I realised I needed some work and decided I should brush up on some of my old programming skills. Ouch! I've been somewhat blessed with a pretty good long term memory and retention. However, there's so much I just plain forgot.

Little details that before were readily available were now lost, leaving me no recourse but to use references. "Where do I import that module from again?" "How do you get hg to automatically update server side?" "What was the keybinding for that command?" "The trigger for that snippet?" "What was the difference between left outer joins and ... "

And so on.

At one point during the year, I actually memorized 72 ascending musical intervals over four or five days. You think I could recall them a month later? No. Cramming doesn't work. You need to review. I can recall the intervals in the key of C and some in G/D/E but most I've forgotten.

Now, I was considering studying this year also. It occurred to me that it would be hard to maintain a decent enough skill/knowledge level to work efficiently, to the point that I wouldn't be better off just digging ditches.

I knew it would be futile to try and cram in a heap of study over the holidays.


  
TODO:
Brushup on WebDev

jQuery in action.
CSS, The Missing Manual.
Firebug, Beginners Guide
SQLAlchemy tutorial
Genshi templating

I've done cramming in the past with things like Django (for an ill-fated project ...), and you invariably forget it unless you use it day to day until it's burned into your long term memory.

How to bypass the need for a long period of 'day to day'? How to get something burned into your memory 'unnaturally'?

One thing you learn as a musician is that you really need to apply 'deliberate practice' to get better. I'm not sure it's something many programmers do. At least I know I never did. In fact, even when 'current' there were some things I routinely used a reference for, even with as little as a week between uses, just because my mind was habitually forgetful.

I left school at 12 years of age, obviously never attending university. Needless to say my study skills aren't very sophisticated. I actually find this thought rather encouraging as I know I'm miles from having plateaued. There's lots of "low hanging fruit" ready to be picked.

So I had 6 weeks until school started again (at this point I planned on returning to study). I realised I should probably read into some study tips.

I downloaded a bootleg copy of the Memletics manual, which goes into quite some depth about the most efficient methods for learning.


  • Nutrition.
  • Exercise.
  • Attitude.
  • SRS.

I have yet to apply most of what I read in there. If you think turning short term memory into long term memory is hard, try developing new habits!

The biggest change in my learning habits is the use of SRS software. In fact I even got in some daily revisions on XMAS day.


Enter Anki


Anki is an implementation (in PyQt) of software-based flashcards. What makes it different to normal flashcards is that it intelligently schedules your cards for repetition at a time when it thinks you're about to forget them. I guess by now you have gleaned that SRS is an acronym for Scheduled Repetition Software (more commonly expanded as "spaced repetition").

The cards are html with all that entails: you can embed sounds/images and hyperlink. It has some other cool features like 'download shared deck'.

In fact, I found that someone had created a jQuery deck with roughly 180 cards.

Now, as I've learned, there's something of an art to creating flashcards. You really need to reduce the amount of information to one or two discrete chunks.

This of course violates your prejudices against redundancy. IIRC, and I'm pretty sure I do (at least I remember this!), there was a meme floating about by the name of DRY, "Don't repeat yourself". With flashcards, scheduled REPETITION ones at that, the whole point is for repetition so that's OK.

As Anki is implemented in Python/PyQt and is extensible via plugins, one of the first things I did was to create a Pyro-based bridge between it and my Python-extensible editor.

I'd copy/paste text from PDFs and websites into a text buffer, manipulate the information into Q/A form, then push 10 to 20 cards at a time using the multiple selection capabilities of my editor.

This information reorganizing process itself is invaluable as it really forces you to go beyond habitual skimming.

This requires some time and discipline and feels like a lot of work so I'd imagine it's not something everyone will have the stomach for. However, 'slow is fast'. What's the alternative? If you spend a few hours skimming and within a few weeks can recall nothing of value you've wasted your time. You have to think long term.

If it's something you think you'll eventually want/need to know then it's a good idea to spend the time upfront. How many times have you wasted countless hours fucking about blind because of artificial deadline pressures? I know people who have been developing for ages who are always 'too busy' to RTFM.

Software development is complex. How can what is basically glorified rote memorisation really help?

How can you make critical implementation decisions if you don't know the frameworks you are using? How many times have you reinvented the wheel because you didn't know your options? How much more creative can you be if you really know your 'material' and can think clearly about possibilities? It's what makes an expert an expert.

I applied Anki to brushing up on CSS, jQuery, Firebug and am learning the PyQt framework. At the rate I crammed some of it in, I doubt I'd have retained much without Anki.


jQuery Examples


One of the projects I've been tinkering on lately is a flash card export of the jQuery API raw XML dump. I wish to create flash cards going from {short_description : method_name} and its inverse {method_name : short_description}.
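Roughly, the two card directions could be generated like this. It's only a sketch: the sample XML is made up, and the `<desc>` element name is my assumption about where the short description lives (only the entry type and name attributes appear in the code in this post). Tab-separated lines are used because Anki can import plain text in that form.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the rough shape of the API dump; <desc> is an assumed element name.
SAMPLE = """
<api><entries>
  <entry type="method" name="toggle"><desc>Display or hide the matched elements.</desc></entry>
  <entry type="method" name="fadeTo"><desc>Adjust the opacity of the matched elements.</desc></entry>
  <entry type="selector" name="first"><desc>Selects the first matched element.</desc></entry>
</entries></api>
"""

def make_cards(xml_text):
    """Return (front, back) pairs in both directions for every method entry."""
    cards = []
    for entry in ET.fromstring(xml_text.strip()).iter('entry'):
        if entry.get('type') != 'method':
            continue
        name = entry.get('name')
        desc = (entry.findtext('desc') or '').strip()
        if name and desc:
            cards.append((desc, name))   # description -> method name
            cards.append((name, desc))   # method name -> description
    return cards

for front, back in make_cards(SAMPLE):
    print(front + '\t' + back)
```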

Not only that, but I want each HTML card to contain links which open up the example code in my editor. Luckily my editor can run commands with arguments passed via the command line. I created a custom sblm:// protocol which allows commands to be sent in the query string.
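To give an idea of the shape of such links, here is a hedged sketch of building and parsing one. Only the sblm:// scheme and the "command in the query string" idea come from the setup described above. The `run` host part, the `command` parameter name and the `open_example` command are invented for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_link(command, **params):
    # Hypothetical layout: sblm://run?command=<name>&<other params>
    return 'sblm://run?' + urlencode(dict(command=command, **params))

def parse_link(url):
    parts = urlsplit(url)
    if parts.scheme != 'sblm':
        raise ValueError('not an sblm link: %r' % url)
    # parse_qs returns lists of values; keep the first value of each parameter.
    query = {k: v[0] for k, v in parse_qs(parts.query).items()}
    command = query.pop('command')
    return command, query

link = build_link('open_example', method='toggle', index='0')
print(link)              # sblm://run?command=open_example&method=toggle&index=0
print(parse_link(link))  # ('open_example', {'method': 'toggle', 'index': '0'})
```

A card would then carry the link in an ordinary anchor tag, and whatever handles the protocol passes the parsed command and arguments on to the editor.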

I created a routine to determine all valid combinations of arguments so I can memorise the signatures of routines and create snippets. ( I'll have to make it a little smarter to filter the conflicting variations somehow. There's only so many ways a function can interpret a single boolean for instance )


    def arg_combinations(args):
        # Count the optional arguments; each one is either present or absent.
        n_opt = sum(1 for e in args if e.get('optional'))

        # Treat n as a bitmask: bit i decides whether the i-th optional arg is kept.
        for n in range(2 ** n_opt):
            opt_args = []
            n_optional = -1

            for arg in args:
                optional = arg.get('optional')
                if optional:
                    n_optional += 1

                # Required args are always kept.
                if not optional or n & (1 << n_optional):
                    opt_args.append(arg)

            yield opt_args

    'toggle()': {'args': [], 'name': 'toggle', 'sig': [<signature>], 'types': []},
    'toggle(callback)': {'args': ['callback'],
                         'name': 'toggle',
                         'sig': [<signature>],
                         'types': ['Callback']},
    'toggle(duration)': {'args': ['duration'],
                         'name': 'toggle',
                         'sig': [<signature>],
                         'types': ['String|Number']},
    'toggle(duration, callback)': {'args': ['duration', 'callback'],
                                   'name': 'toggle',
                                   'sig': [<signature>],
                                   'types': ['String|Number', 'Callback']},
    'toggle(duration, easing)': {'args': ['duration', 'easing'],
                                 'name': 'toggle',
                                 'sig': [<signature>],
                                 'types': ['String|Number', 'String']},
    'toggle(duration, easing, callback)': {'args': ['duration',
                                                    'easing',
                                                    'callback'],
                                           'name': 'toggle',
                                           'sig': [<signature>],
                                           'types': ['String|Number',
                                                     'String',
                                                     'Callback']},
    'toggle(easing)': {'args': ['easing'],
                       'name': 'toggle',
                       'sig': [<signature>],
                       'types': ['String']},
    'toggle(easing, callback)': {'args': ['easing', 'callback'],
                                 'name': 'toggle',
                                 'sig': [<signature>],
                                 'types': ['String', 'Callback']},
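The same enumeration (keep every required argument, take every subset of the optional ones) can also be sketched with itertools.product instead of a bitmask. This is an alternative formulation, not the routine used to produce the dump above. Plain dicts stand in for the XML argument elements, and the combinations come out in a different order.

```python
from itertools import product

def arg_combinations(args):
    """Yield each argument list: all required args plus one subset of the optional ones."""
    optional_idx = [i for i, a in enumerate(args) if a.get('optional')]
    for flags in product([False, True], repeat=len(optional_idx)):
        keep = dict(zip(optional_idx, flags))
        # Required args are missing from `keep`, so they default to True.
        yield [a for i, a in enumerate(args) if keep.get(i, True)]

# Stand-ins for <argument> elements; 'optional' mirrors the XML attribute.
toggle_args = [
    {'name': 'duration', 'optional': 'true'},
    {'name': 'easing', 'optional': 'true'},
    {'name': 'callback', 'optional': 'true'},
]
combos = [[a['name'] for a in c] for c in arg_combinations(toggle_args)]
print(len(combos))  # 8
print(combos[:3])   # [[], ['callback'], ['easing']]
```

Three optional arguments give 2 ** 3 = 8 combinations, which matches the eight toggle entries in the dump.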

I wonder if you really can pass a single string as an arg and have the toggle function determine whether you meant duration or easing? I'll have to try it out or, better yet, look at the jQuery source.
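Going by the jQuery.speed excerpt quoted elsewhere on this page, the answer looks like "no". Here is a rough Python transliteration of just the argument juggling, so the cases can be poked at. It is a sketch, not jQuery: the options-object branch, jQuery.fx.off and the queueing wrapper are left out, and I haven't run it against the real library. The speed values are jQuery's slow/fast/_default defaults.

```python
FX_SPEEDS = {'slow': 600, 'fast': 200, '_default': 400}

def speed_options(speed=None, easing=None, fn=None):
    """Transliteration of the duration/easing/complete juggling in jQuery.speed."""
    # complete: fn || !fn && easing || isFunction(speed) && speed
    complete = fn or (not fn and easing) or (callable(speed) and speed) or None
    # easing: fn && easing || easing && !isFunction(easing) && easing
    easing_opt = (fn and easing) or (easing and not callable(easing) and easing) or None

    if isinstance(speed, (int, float)) and not isinstance(speed, bool):
        duration = speed
    elif isinstance(speed, str) and speed in FX_SPEEDS:
        duration = FX_SPEEDS[speed]
    else:
        duration = FX_SPEEDS['_default']

    return {'duration': duration, 'easing': easing_opt, 'complete': complete}

# A lone string lands in the duration slot; it is never tried as an easing.
print(speed_options('linear'))
# {'duration': 400, 'easing': None, 'complete': None}
```

So a single string is always read as a duration: an unknown name such as 'linear' just falls back to the default duration with no easing set. If that holds in real jQuery, the toggle(easing) and toggle(easing, callback) entries are combinations worth filtering out.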

For the purposes of automatically creating snippets I guess you really only want to look at the different signature types. jQuery uses anonymous functions and literal string arguments a lot, so creating snippets with placeholder args already quoted and function(){} inline seems like a decent idea.
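As a sketch of what that could look like, the function below turns one signature's args and types into snippet text. The `${1:name}` tab-stop syntax is an assumption (TextMate style); substitute whatever your editor expects. The function name snippet_for is made up.

```python
def snippet_for(name, args, types):
    """Build snippet text for one signature; ${n:label} marks a tab stop (assumed syntax)."""
    parts = []
    for i, (arg, typ) in enumerate(zip(args, types), 1):
        if typ in ('Function', 'Callback'):
            # Inline anonymous function with the cursor stop inside the body.
            parts.append('function(){ ${%d} }' % i)
        elif typ == 'String':
            # String arguments come pre-quoted.
            parts.append("'${%d:%s}'" % (i, arg))
        else:
            parts.append('${%d:%s}' % (i, arg))
    return '.%s(%s)' % (name, ', '.join(parts))

print(snippet_for('toggle', ['duration', 'callback'], ['String', 'Callback']))
# .toggle('${1:duration}', function(){ ${2} })
```

Multi-type arguments such as 'String|Number' fall through to the unquoted placeholder here, since quoting them would be wrong half the time.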


Web Snippets


There's a reason iterative development is usurping upfront design methodologies. We are fallible and mostly lacking detailed imagination. Our most glaring and atypically ungrudging admission: the fact we are so obsessed with testing.

I'm a great believer in efficient text editing. This shortens your feedback loop. If you can input text twice as fast you get through, I dunno, probably not twice as many, but a lot more 'fuckups'. Iterations!

The thing about snippets I've found is that unless I can recall instantly the abbreviation, I'll tend to just type in the whole words.

Now this really ties back to what I was saying before about 'deliberate practice'

In theory, if you practiced enough you could use completely arbitrary 2 letter combinations to compress an API worth of words down. However, memorising, even with SRS, is easier with some mnemonics so...

I've actually written some routines which will take a list of words, attempt to create ideal abbreviations, and ensure they are unique.
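The uniqueness part on its own can be sketched quite simply. This is a deliberately naive stand-in, not the routines mentioned above: each word gets the first free candidate made of its first letter plus later letters in order, with shorter candidates tried first. The function names are made up.

```python
from itertools import combinations

def candidates(word, max_len=3):
    """Yield abbreviations: the first letter plus later letters in order, shortest first."""
    first, rest = word[0], word[1:]
    for extra in range(1, max_len):
        for combo in combinations(rest, extra):
            yield first + ''.join(combo)

def assign_abbreviations(words, max_len=3):
    taken = {}  # abbreviation -> word
    for word in words:
        for abbr in candidates(word, max_len):
            if abbr not in taken:
                taken[abbr] = word
                break
        else:
            raise ValueError('no free abbreviation for %r' % word)
    return {word: abbr for abbr, word in taken.items()}

abbrs = assign_abbreviations(['border', 'background', 'bottom'])
print(abbrs)  # {'border': 'bo', 'background': 'ba', 'bottom': 'bt'}
```

Words earlier in the list get first pick, so sorting the input by how often each word is typed would hand the nicest abbreviations to the most used words.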

One of the routines uses the Carnegie Mellon University Pronouncing Dictionary, cmudict for short, found in the nltk toolkit.


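    import re
    from collections import defaultdict
    from nltk.corpus import cmudict

    # Assumed setup (not shown in the original): word -> list of pronunciations.
    CMU = cmudict.dict()

    def rank_letters(wd):
        if wd not in CMU:
            return []

        # Replace vowel phonemes (they carry a stress digit) with '_'.
        cmu = ''.join(['_' if ch[-1].isdigit() else ch for ch in CMU[wd][0]])
        cmus = defaultdict(int)

        # Consonant positions relative to vowel sounds; a letter's rank is the
        # highest-numbered pattern it matches.
        mapping = r'^(\w)_|_(\w)_|(\w)_|_(\w)$|_(\w)'

        for i, w in enumerate(mapping.split('|')):
            for m in re.finditer(w, cmu):
                ch = m.group(1).lower()
                if ch in wd:
                    cmus[ch] = max(i + 1, cmus[ch])

        return sorted(cmus.keys(), key=lambda k: cmus[k])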

Intuitively I sensed that letters surrounded by vowel sounds are the 'strongest' letters in a word, followed by those adjacent to only one. It seems to work reasonably well for such a crude method.

e.g.
document - DocuMeNt - dm
border - BorDer - bd

I also used the words corpus to create routines to decompose composite words.


mouseup - mouse, up - mu
mousedown - mouse, down - md
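The decomposition step can be sketched as a recursive split against a vocabulary. A tiny hand-made set stands in for the words corpus here so the block runs without any downloads, and the function names are invented for illustration.

```python
def decompose(word, vocab):
    """Split word into vocabulary words, preferring the longest head; None if impossible."""
    if not word:
        return []
    for i in range(len(word), 0, -1):
        head = word[:i]
        if head in vocab:
            tail = decompose(word[i:], vocab)
            if tail is not None:
                return [head] + tail
    return None

def abbreviate(word, vocab):
    # Abbreviation = first letter of each component word.
    parts = decompose(word, vocab)
    return ''.join(p[0] for p in parts) if parts else None

VOCAB = {'mouse', 'up', 'down', 'key', 'press'}  # stand-in for a real word list

print(decompose('mouseup', VOCAB))     # ['mouse', 'up']
print(abbreviate('mousedown', VOCAB))  # md
```

With a real corpus the vocabulary contains lots of very short "words", so some filter on minimum part length would probably be needed to avoid silly splits.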

I'm going to create desc/entity flashcards and snippets also for html elements/attributes and css property/values.

With my editor at least, the bindings are smart enough to know when the cursor is inside a tag, e.g. <|div>. Why the hell type in annoyingly long character sequences when the set of identifiers compresses down to one, two, maybe three (ouch) character ids?

Over a course of weeks, you could, with SRS, systematically memorise the entire html set of elements and their attributes along with related snippet abbreviations. Shld I sy abrv instd?

It would be analogous to touchtyping. A lifetime of benefit for a short period of practice.

On second thought ... A lifetime of entering html/css? There's a horrid thought...


Statistics


Ideally I'd like to give the most frequently used entities the most suitable and short abbreviations, one character ones at times.

Google apparently did a massive study looking at web authoring statistics. They have the data as to which elements should receive the honorary 1 keyers.


Namespaces


You mean there are JavaScript snippet candidates other than jQuery? Poses some problems. Where there's a problem, there's a solution.