string name of variable(object) [duplicate] - c#

I would like to get the name of a variable as a string, but I don't know whether Python has that much introspection capability. Something like:
>>> print(my_var.__name__)
'my_var'
I want to do that because I have a bunch of variables I'd like to turn into a dictionary, like:
bar = True
foo = False
>>> my_dict = dict(bar=bar, foo=foo)
>>> print my_dict
{'foo': False, 'bar': True}
But I'd like something more automatic than that.
Python has locals() and vars(), so I guess there is a way.

As unwind said, this isn't really something you do in Python - variables are actually name mappings to objects.
However, here's one way to try and do it:
>>> a = 1
>>> for k, v in list(locals().iteritems()):
...     if v is a:
...         a_as_str = k
...
>>> a_as_str
'a'
>>> type(a_as_str)
<type 'str'>

I've wanted to do this quite a lot. This hack is very similar to rlotun's suggestion, but it's a one-liner, which is important to me:
blah = 1
blah_name = [ k for k,v in locals().iteritems() if v is blah][0]
Python 3+
blah = 1
blah_name = [ k for k,v in locals().items() if v is blah][0]

Are you trying to do this?
dict( (name,eval(name)) for name in ['some','list','of','vars'] )
Example
>>> some= 1
>>> list= 2
>>> of= 3
>>> vars= 4
>>> dict( (name,eval(name)) for name in ['some','list','of','vars'] )
{'list': 2, 'some': 1, 'vars': 4, 'of': 3}

This is a hack. It will not work on all Python implementations (in particular, those that do not have traceback.extract_stack).
import traceback
def make_dict(*expr):
    (filename,line_number,function_name,text)=traceback.extract_stack()[-2]
    begin=text.find('make_dict(')+len('make_dict(')
    end=text.find(')',begin)
    text=[name.strip() for name in text[begin:end].split(',')]
    return dict(zip(text,expr))
bar=True
foo=False
print(make_dict(bar,foo))
# {'foo': False, 'bar': True}
Note that this hack is fragile:
make_dict(bar,
          foo)
(calling make_dict on 2 lines) will not work.
Instead of trying to generate the dict out of the values foo and bar,
it would be much more Pythonic to generate the dict out of the string variable names 'foo' and 'bar':
dict([(name,locals()[name]) for name in ('foo','bar')])
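One caveat with calling locals() inside the comprehension: as far as I know, on Python 3 versions before 3.12 a comprehension body runs in its own scope, so locals() there may not contain foo and bar. A minimal sketch that sidesteps this by capturing the namespace once and passing it in (dict_from_names is a hypothetical helper name, not part of the answer above):

```python
def dict_from_names(namespace, names):
    """Build {name: value} by looking each name up in an explicit namespace mapping."""
    return {name: namespace[name] for name in names}

bar = True
foo = False

# locals() is called here, in the scope that actually owns foo and bar.
my_dict = dict_from_names(locals(), ("foo", "bar"))
print(my_dict)
```

Passing the mapping explicitly also makes it obvious which namespace is being read.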

This is not possible in Python, which really doesn't have "variables". Python has names, and there can be more than one name for the same object.

I think my problem will help illustrate why this question is useful, and it may give a bit more insight into how to answer it. I wrote a small function to do a quick inline head check on various variables in my code. Basically, it lists the variable name, data type, size, and other attributes, so I can quickly catch any mistakes I've made. The code is simple:
def details(val):
    vn = val.__name__ # If such a thing existed
    vs = str(val)
    print("The Value of "+ str(vn) + " is " + vs)
    print("The data type of " + vn + " is " + str(type(val)))
So if you have some complicated dictionary / list / tuple situation, it would be quite helpful to have the interpreter return the variable name you assigned. For instance, here is a weird dictionary:
m = 'abracadabra'
mm=[]
for n in m:
    mm.append(n)
mydic = {'first':(0,1,2,3,4,5,6),'second':mm,'third':np.arange(0.,10)}
details(mydic)
The Value of mydic is {'second': ['a', 'b', 'r', 'a', 'c', 'a', 'd', 'a', 'b', 'r', 'a'], 'third': array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]), 'first': (0, 1, 2, 3, 4, 5, 6)}
The data type of mydic is <type 'dict'>
details(mydic['first'])
The Value of mydic['first'] is (0, 1, 2, 3, 4, 5, 6)
The data type of mydic['first'] is <type 'tuple'>
details(mydic.keys())
The Value of mydic.keys() is ['second', 'third', 'first']
The data type of mydic.keys() is <type 'list'>
details(mydic['second'][0])
The Value of mydic['second'][0] is a
The data type of mydic['second'][0] is <type 'str'>
I'm not sure if I put this in the right place, but I thought it might help. I hope it does.
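For a quick inline check like this, Python 3.8 and later offer the = specifier in f-strings, which echoes the expression text followed by its repr. A small sketch, assuming Python 3.8+ and a cut-down mydic (the sample data is illustrative only):

```python
mydic = {"first": (0, 1, 2), "second": list("abc")}

# The '=' specifier emits the source text of the expression, then repr(value).
value_line = f"{mydic['first']=}"
type_line = f"{type(mydic)=}"

print(value_line)  # mydic['first']=(0, 1, 2)
print(type_line)   # type(mydic)=<class 'dict'>
```

This does not recover a name from an object; the expression text comes from the source code of the f-string itself, which is usually all a debugging print needs.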

I wrote a neat little useful function based on the answer to this question. I'm putting it here in case it's useful.
def what(obj, callingLocals=locals()):
    """
    quick function to print name of input and value.
    If not for the default-Valued callingLocals, the function would always
    get the name as "obj", which is not what I want.
    """
    for k, v in list(callingLocals.items()):
        if v is obj:
            name = k
    print(name, "=", obj)
usage:
>>> a = 4
>>> what(a)
a = 4

I find that if you already have a specific list of values, the way described by @S.Lott is the best; however, the approach below collects all variables and classes added throughout the code WITHOUT the need to provide variable names, though you can specify them if you want. The code can be extended to exclude classes.
import types
import math # mainly showing that you could import what you will before d
# Everything after this counts
d = dict(globals())
def kv_test(k,v):
    return (k not in d and
            k not in ['d','args'] and
            type(v) is not types.FunctionType)
def magic_print(*args):
    if len(args) == 0:
        return {k:v for k,v in globals().iteritems() if kv_test(k,v)}
    else:
        return {k:v for k,v in magic_print().iteritems() if k in args}
if __name__ == '__main__':
    foo = 1
    bar = 2
    baz = 3
    print magic_print()
    print magic_print('foo')
    print magic_print('foo','bar')
Output:
{'baz': 3, 'foo': 1, 'bar': 2}
{'foo': 1}
{'foo': 1, 'bar': 2}

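In Python 3 this is easy:
myVariable = 5
for v in list(locals()):  # copy the keys: the loop variable v is itself added to locals()
    if v == "myVariable":  # id(v) == id("myVariable") only holds when both strings are interned
        print(v, locals()[v])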
this will print:
myVariable 5

Python3. Use inspect to capture the calling local namespace then use ideas presented here. Can return more than one answer as has been pointed out.
def varname(var):
    import inspect
    frame = inspect.currentframe()
    var_id = id(var)
    for name in frame.f_back.f_locals.keys():
        try:
            if id(eval(name)) == var_id:
                return(name)
        except:
            pass

Here's the function I created to read the variable names. It's more general and can be used in different applications:
def get_variable_name(*variable):
    '''gets string of variable name
    inputs
        variable (str)
    returns
        string
    '''
    if len(variable) != 1:
        raise Exception('len of variables inputed must be 1')
    try:
        return [k for k, v in locals().items() if v is variable[0]][0]
    except:
        return [k for k, v in globals().items() if v is variable[0]][0]
To use it in the specified question:
>>> foo = False
>>> bar = True
>>> my_dict = {get_variable_name(foo):foo,
...            get_variable_name(bar):bar}
>>> my_dict
{'bar': True, 'foo': False}

In reading the thread, I saw an awful lot of friction. It's easy enough to give
a bad answer, then let someone give the correct answer. Anyway, here is what I found.
From: [effbot.org] (http://effbot.org/zone/python-objects.htm#names)
The names are a bit different — they’re not really properties of the object, and the object itself doesn't know what it’s called.
An object can have any number of names, or no name at all.
Names live in namespaces (such as a module namespace, an instance namespace, a function’s local namespace).
Note that it says the object itself doesn't know what it's called, so that was the clue. Python objects are not self-referential. Then it says names live in namespaces. We have this in Tcl/Tk. So maybe my answer will help (it did help me).
jj = 123
print eval("'" + str(id(jj)) + "'")
print dir()
166707048
['__builtins__', '__doc__', '__file__', '__name__', '__package__', 'jj']
So there is 'jj' at the end of the list.
Rewrite the code as:
jj = 123
print eval("'" + str(id(jj)) + "'")
for x in dir():
    print id(eval(x))
161922920
['__builtins__', '__doc__', '__file__', '__name__', '__package__', 'jj']
3077447796
136515736
3077408320
3077656800
136515736
161922920
This nasty bit of code id's the name of variable/object/whatever-you-pedantics-call-it.
So, there it is. The memory address of 'jj' is the same when we look for it directly, as when we do the dictionary look up in global name space. I'm sure you can make a function to do this. Just remember which namespace your variable/object/wypci is in.
QED.

I wrote the package sorcery to do this kind of magic robustly. You can write:
from sorcery import dict_of
my_dict = dict_of(foo, bar)

Maybe I'm overthinking this but..
str_l = next((k for k,v in locals().items() if id(l) == id(v)))
>>> bar = True
>>> foo = False
>>> my_dict=dict(bar=bar, foo=foo)
>>> next((k for k,v in locals().items() if id(bar) == id(v)))
'bar'
>>> next((k for k,v in locals().items() if id(foo) == id(v)))
'foo'
>>> next((k for k,v in locals().items() if id(my_dict) == id(v)))
'my_dict'

import re
import traceback
pattren = re.compile(r'[\W+\w+]*get_variable_name\((\w+)\)')
def get_variable_name(x):
    return pattren.match(traceback.extract_stack(limit=2)[0][3]).group(1)
a = 1
b = a
c = b
print get_variable_name(a)
print get_variable_name(b)
print get_variable_name(c)

I uploaded a solution to pypi. It's a module defining an equivalent of C#'s nameof function.
It iterates through bytecode instructions for the frame it's called in, getting the names of variables/attributes passed to it. The names are found in the .argrepr of LOAD instructions following the function's name.

Most objects don't have a __name__ attribute. (Classes, functions, and modules do; any more builtin types that have one?)
What else would you expect for print(my_var.__name__) other than print("my_var")? Can you simply use the string directly?
You could "slice" a dict:
def dict_slice(D, keys, default=None):
    return dict((k, D.get(k, default)) for k in keys)
print dict_slice(locals(), ["foo", "bar"])
# or use set literal syntax if you have a recent enough version:
print dict_slice(locals(), {"foo", "bar"})
Alternatively:
throw = object() # sentinel
def dict_slice(D, keys, default=throw):
    def get(k):
        v = D.get(k, throw)
        if v is not throw:
            return v
        if default is throw:
            raise KeyError(k)
        return default
    return dict((k, get(k)) for k in keys)

Well, I encountered the very same need a few days ago and had to get a variable's name which was pointing to the object itself.
And why was it so necessary?
In short, I was building a plug-in for Maya. The core plug-in was built in C++, but the GUI is drawn through Python (as it's not processor intensive). Since I don't yet know how to return multiple values from the plug-in other than the default MStatus, to update a dictionary in Python I had to pass the plug-in the name of the variable that points to the object implementing the GUI (which contains the dictionary itself), and then use MGlobal::executePythonCommand() to update the dictionary from Maya's global scope.
To do that what I did was something like:
import time
class foo(bar):
    def __init__(self):
        super(foo, self).__init__()
        self.time = time.time() #almost guaranteed to be unique on a single computer
    def name(self):
        g = globals()
        for x in g:
            if isinstance(g[x], type(self)):
                if g[x].time == self.time:
                    return x
        #or you could:
        #return filter(None,[x if g[x].time == self.time else None for x in g if isinstance(g[x], type(self))])
        #and return all keys pointing to object itself
I know that it is not a perfect solution, since many keys in the globals could be pointing to the same object, e.g.:
a = foo()
b = a
b.name()
>>>b
or
>>>a
and that the approach isn't thread-safe. Correct me if I am wrong.
At least this approach solved my problem by getting the name of any variable in the global scope that pointed to the object itself and passing it to the plug-in as an argument for its internal use.
I tried this on int (the primitive integer class), but the problem is that these primitive classes don't get bypassed (please correct the technical terminology if it's wrong). You could re-implement int and then do int = foo, but a = 3 will never be an object of foo, only of the primitive. To overcome that you have to write a = foo(3) to get a.name() to work.

With Python 2.7 and newer there is also the dictionary comprehension, which makes it a bit shorter. If possible I would use getattr instead of eval (eval is evil) as in the top answer. self can be any object that has the context you are looking at. It can be an object or locals=locals() etc.
{name: getattr(self, name) for name in ['some', 'vars', 'here']}
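A minimal sketch of that comprehension in use, assuming the values live as attributes on some object; types.SimpleNamespace stands in here for whatever object holds your context, and the attribute names are made up for illustration:

```python
from types import SimpleNamespace

# Stand-in for "self": any object exposing the values as attributes.
settings = SimpleNamespace(some=1, vars=2, here=3, extra=4)

wanted = ["some", "vars", "here"]
picked = {name: getattr(settings, name) for name in wanted}
print(picked)
```

Unlike eval, getattr only performs an attribute lookup, so a stray string in the list cannot execute arbitrary code; a missing attribute raises AttributeError unless a default is supplied as a third argument.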

I was working on a similar problem. @S.Lott said "If you have the list of variables, what's the point of "discovering" their names?" My answer is: just to see if it could be done, and in case for some reason you want to sort your variables by type into lists. Anyway, in my research I came across this thread, and my solution is a bit expanded and is based on @rlotun's solution. One other thing: @unutbu said, "This idea has merit, but note that if two variable names reference the same value (e.g. True), then an unintended variable name might be returned." In this exercise that was true, so I dealt with it by using a list comprehension similar to this for each possibility: isClass = [i for i in isClass if i != 'item']. Without it, "item" would show up in each list.
__metaclass__ = type
from types import *
class Class_1: pass
class Class_2: pass
list_1 = [1, 2, 3]
list_2 = ['dog', 'cat', 'bird']
tuple_1 = ('one', 'two', 'three')
tuple_2 = (1000, 2000, 3000)
dict_1 = {'one': 1, 'two': 2, 'three': 3}
dict_2 = {'dog': 'collie', 'cat': 'calico', 'bird': 'robin'}
x = 23
y = 29
pie = 3.14159
eee = 2.71828
house = 'single story'
cabin = 'cozy'
isClass = []; isList = []; isTuple = []; isDict = []; isInt = []; isFloat = []; isString = []; other = []
mixedDataTypes = [Class_1, list_1, tuple_1, dict_1, x, pie, house, Class_2, list_2, tuple_2, dict_2, y, eee, cabin]
print '\nMIXED_DATA_TYPES total count:', len(mixedDataTypes)
for item in mixedDataTypes:
    try:
        # if isinstance(item, ClassType): # use this for old class types (before 3.0)
        if isinstance(item, type):
            for k, v in list(locals().iteritems()):
                if v is item:
                    mapping_as_str = k
                    isClass.append(mapping_as_str)
            isClass = [i for i in isClass if i != 'item']
        elif isinstance(item, ListType):
            for k, v in list(locals().iteritems()):
                if v is item:
                    mapping_as_str = k
                    isList.append(mapping_as_str)
            isList = [i for i in isList if i != 'item']
        elif isinstance(item, TupleType):
            for k, v in list(locals().iteritems()):
                if v is item:
                    mapping_as_str = k
                    isTuple.append(mapping_as_str)
            isTuple = [i for i in isTuple if i != 'item']
        elif isinstance(item, DictType):
            for k, v in list(locals().iteritems()):
                if v is item:
                    mapping_as_str = k
                    isDict.append(mapping_as_str)
            isDict = [i for i in isDict if i != 'item']
        elif isinstance(item, IntType):
            for k, v in list(locals().iteritems()):
                if v is item:
                    mapping_as_str = k
                    isInt.append(mapping_as_str)
            isInt = [i for i in isInt if i != 'item']
        elif isinstance(item, FloatType):
            for k, v in list(locals().iteritems()):
                if v is item:
                    mapping_as_str = k
                    isFloat.append(mapping_as_str)
            isFloat = [i for i in isFloat if i != 'item']
        elif isinstance(item, StringType):
            for k, v in list(locals().iteritems()):
                if v is item:
                    mapping_as_str = k
                    isString.append(mapping_as_str)
            isString = [i for i in isString if i != 'item']
        else:
            for k, v in list(locals().iteritems()):
                if v is item:
                    mapping_as_str = k
                    other.append(mapping_as_str)
            other = [i for i in other if i != 'item']
    except (TypeError, AttributeError), e:
        print e
print '\n isClass:', len(isClass), isClass
print ' isList:', len(isList), isList
print ' isTuple:', len(isTuple), isTuple
print ' isDict:', len(isDict), isDict
print ' isInt:', len(isInt), isInt
print ' isFloat:', len(isFloat), isFloat
print 'isString:', len(isString), isString
print ' other:', len(other), other
# my output and the output I wanted
'''
MIXED_DATA_TYPES total count: 14
isClass: 2 ['Class_1', 'Class_2']
isList: 2 ['list_1', 'list_2']
isTuple: 2 ['tuple_1', 'tuple_2']
isDict: 2 ['dict_1', 'dict_2']
isInt: 2 ['x', 'y']
isFloat: 2 ['pie', 'eee']
isString: 2 ['house', 'cabin']
other: 0 []
'''

You can use easydict:
>>> from easydict import EasyDict as edict
>>> d = edict({'foo':3, 'bar':{'x':1, 'y':2}})
>>> d.foo
3
>>> d.bar.x
1
>>> d = edict(foo=3)
>>> d.foo
3
another example:
>>> d = edict(log=False)
>>> d.debug = True
>>> d.items()
[('debug', True), ('log', False)]

On Python 3, this function will get the outermost name in the stack:
import inspect
def retrieve_name(var):
    """
    Gets the name of var. Does it from the outermost frame inwards.
    :param var: variable to get name from.
    :return: string
    """
    for fi in reversed(inspect.stack()):
        names = [var_name for var_name, var_val in fi.frame.f_locals.items() if var_val is var]
        if len(names) > 0:
            return names[0]
It is useful anywhere in the code. It traverses the reversed stack looking for the first match.

While this is probably an awful idea, it is along the same lines as rlotun's answer but it'll return the correct result more often.
import inspect
def getVarName(getvar):
    frame = inspect.currentframe()
    callerLocals = frame.f_back.f_locals
    for k, v in list(callerLocals.items()):
        if v is getvar():
            callerLocals.pop(k)
            try:
                getvar()
                callerLocals[k] = v
            except NameError:
                callerLocals[k] = v
                del frame
                return k
    del frame
You call it like this:
bar = True
foo = False
bean = False
fooName = getVarName(lambda: foo)
print(fooName) # prints "foo"

Pass the variable as a keyword argument, then convert the keys to a list and return the first one:
def get_var_name(**kwargs):
    """get variable name
    get_var_name(var = var)
    Returns:
        [str] -- var name
    """
    return list(kwargs.keys())[0]
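A short usage sketch for the function above, applied to the dictionary from the question. Note that the caller types the name as the keyword, so nothing is discovered automatically; the end result matches dict(foo=foo, bar=bar):

```python
def get_var_name(**kwargs):
    """Return the single keyword the caller used, e.g. get_var_name(foo=foo) -> 'foo'."""
    return list(kwargs.keys())[0]

foo = False
bar = True

# Each call names the variable once as the keyword and once as the value.
my_dict = {get_var_name(foo=foo): foo, get_var_name(bar=bar): bar}
print(my_dict)
```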

This will not return the name of a variable, but you can easily create a dictionary from global variables.
class CustomDict(dict):
    def __add__(self, other):
        return CustomDict({**self, **other})
class GlobalBase(type):
    def __getattr__(cls, key):
        return CustomDict({key: globals()[key]})
    def __getitem__(cls, keys):
        return CustomDict({key: globals()[key] for key in keys})
class G(metaclass=GlobalBase):
    pass
x, y, z = 0, 1, 2
print('method 1:', G['x', 'y', 'z']) # Outcome: method 1: {'x': 0, 'y': 1, 'z': 2}
print('method 2:', G.x + G.y + G.z) # Outcome: method 2: {'x': 0, 'y': 1, 'z': 2}

With python-varname you can easily do it:
pip install python-varname
from varname import Wrapper
foo = Wrapper(True)
bar = Wrapper(False)
your_dict = {val.name: val.value for val in (foo, bar)}
print(your_dict)
# {'foo': True, 'bar': False}
Disclaimer: I'm the author of that python-varname library.

>>> a = 1
>>> b = 1
>>> id(a)
34120408
>>> id(b)
34120408
>>> a is b
True
>>> id(a) == id(b)
True
So with this approach, the name retrieved for a may be either 'a' or 'b'.
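The ambiguity can be made explicit by returning every matching name rather than the first one. A sketch, where names_of is a hypothetical helper and the namespace is passed in explicitly (locals() or globals() could be passed the same way):

```python
def names_of(obj, namespace):
    """Return every name in `namespace` bound to exactly this object (identity, not equality)."""
    return sorted(name for name, value in namespace.items() if value is obj)

a = [1]
b = a      # a second name for the same list object
c = [1]    # an equal value, but a different object

ns = {"a": a, "b": b, "c": c}
print(names_of(a, ns))  # ['a', 'b']
print(names_of(c, ns))  # ['c']
```

Lists are used here on purpose: small integers such as 1 may be shared by the interpreter, which is why a and b in the session above report the same id.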

Related

Conventional name for matrix with named values? C# dataframe

I'd like to implement something in C# where I could define a multidimensional array and index accordingly or get a specific value by key. Is there a conventional name for this? Or a package for it?
I've implemented this in Python, but I did a bunch of method overriding to make it work conventionally. There, usage looks like this:
class NamedTicTacToeBoard(VariableMatrix):
    def __init__(self):
        VariableMatrix.__init__(self)
        self.__shape__ = (3, 3)
        self.A1 = X
        self.A2 = O
        self.A3 = X
        # and so forth...
board = NamedTicTacToeBoard()
board[0, :]
>> [X, O, X]
board.A1
>> X
Thanks for the help
Edit: I'm not actually making a TicTacToe board, it's for a GNC, so I need to do a bunch of matrix algebra, but also reference the states.

Running number of items in subgroups within ienumerable of items

Say I have an
IEnumerable< IEnumerable< string > > rowsOfTextColumns
The inner ienumerable string values represent columns in a row, thus the outer ienumerable stores several rows of text columns.
Like: 3 rows by 4 columns:
12345 foo 2014-10-16 09:55 blah
12345 foo 2014-10-16 09:55 bleh
67890 bar 2014-10-16 09:58 ugh
The DateTime column values are not unique - as you can see in the example, several entries at the same time are possible. But datetime makes most sense to use as ID in my data.
Since I want a unique ID for each row, I would like to add a column to each row "on the fly", which contains the occurrence number among entries with the same datetime, starting with 1. Like this:
12345 foo 2014-10-16 09:55 blah (1)
12345 foo 2014-10-16 09:55 bleh (2)
67890 bar 2014-10-16 09:58 ugh (1)
(For clarification: the unique id would be a compound of datetime + running number within datetime subgroup)
Sure I know how to do this some way.
But - how is this done most elegantly, e.g. using LINQ / functional programming aspects of C#?
Furthermore I am curious, how would the same be done most elegantly in F#?
EDIT #1: better illustrated the source data format
EDIT #2:
All right, using GroupBy as suggested in one comment, I got this so far (in C#; look at my selected answer for F# code):
var groupsByDatetime = rowsOfColumns.GroupBy( rec => rec.ElementAt(2) );
var extendedRows =
groupsByDatetime.SelectMany( g =>
g.Select( (columns,i) =>
columns.Concat( new[]{(1+i).ToString()} ) ) );
Anyone bids less? :)
Well doesn't look too bad already I guess.
This groups the items and maps each item to include its index within the group.
let groupAndIndexItems keySelector =
    Seq.groupBy keySelector
    >> Seq.map (fun (key, items) ->
        let indexedItems = items |> Seq.mapi (fun i x -> x, i)
        key, indexedItems
    )
Example usage:
[
    12345, "foo", "2014-10-16 09:55", "blah"
    12345, "foo", "2014-10-16 09:55", "bleh"
    67890, "bar", "2014-10-16 09:58", "ugh"
]
|> groupAndIndexItems (fun (_, _, s, _) -> s)
Output:
val it : seq<string * seq<(int * string * string * string) * int>> =
seq
[("2014-10-16 09:55",
seq [((12345, "foo", "2014-10-16 09:55", "blah"), 0);
((12345, "foo", "2014-10-16 09:55", "bleh"), 1)]);
("2014-10-16 09:58",
seq [((67890, "bar", "2014-10-16 09:58", "ugh"), 0)])]
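For comparison, a sketch of the same running-number idea in Python. It is not a translation of the C# or F# code above: instead of grouping, it keeps a per-key counter, so the original row order is preserved. add_running_number is a hypothetical helper name, and the rows are the sample data from the question:

```python
from collections import defaultdict

def add_running_number(rows, key_index):
    """Append a 1-based running number per key value, preserving row order."""
    seen = defaultdict(int)  # key value -> how many rows with that key so far
    numbered = []
    for row in rows:
        key = row[key_index]
        seen[key] += 1
        numbered.append(tuple(row) + (seen[key],))
    return numbered

rows = [
    ("12345", "foo", "2014-10-16 09:55", "blah"),
    ("12345", "foo", "2014-10-16 09:55", "bleh"),
    ("67890", "bar", "2014-10-16 09:58", "ugh"),
]
numbered = add_running_number(rows, key_index=2)
for row in numbered:
    print(row)
```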

How are ambiguous enum values resolved in C#?

I checked the section of the C# language specification regarding enums, but was unable to explain the output for the following code:
enum en {
    a = 1, b = 1, c = 1,
    d = 2, e = 2, f = 2,
    g = 3, h = 3, i = 3,
    j = 4, k = 4, l = 4
}
en[] list = new en[] {
    en.a, en.b, en.c,
    en.d, en.e, en.f,
    en.g, en.h, en.i,
    en.j, en.k, en.l
};
foreach (en ele in list) {
    Console.WriteLine("{1}: {0}", (int)ele, ele);
}
It outputs:
c: 1
c: 1
c: 1
d: 2
d: 2
d: 2
g: 3
g: 3
g: 3
k: 4
k: 4
k: 4
Now, why would it select the third "1", the first "2" and "3", but the second "4"? Is this undefined behavior, or am I missing something obvious?
This is specifically documented to be undocumented behaviour.
There is probably something in the way the code is written that will end up picking the same thing every time but the documentation of Enum.ToString states this:
If multiple enumeration members have the same underlying value and you attempt to retrieve the string representation of an enumeration member's name based on its underlying value, your code should not make any assumptions about which name the method will return.
(my emphasis)
As mentioned in a comment, a different .NET runtime might return different values, but the whole problem with undocumented behaviour is that it is prone to change for no (seemingly) good reason. It could change depending on the weather, the time, the mood of the programmer, or even in a hotfix to the .NET runtime. You cannot rely on undocumented behavior.
Note that in your example, ToString is exactly what you want to look at since you're printing the value, which will in turn convert it to a string.
If you try to do a comparison, all the enum values with the same underlying numerical value are equivalent, and you cannot tell which one you stored in a variable in the first place.
In other words, if you do this:
var x = en.a;
there is no way to afterwards deduce that you wrote en.a and not en.b or en.c as they all compare equal, they all have the same underlying value. Well, short of creating a program that reads its own source.

Generate Combinations for a set of Numbers

In C#, I want to generate combinations for {1,2,3,4,5,6,7,8,9,0} in 5 digits. So, I want to get an output of 11111,11112, etc up to 99999.
When I searched I didn't get anything that could work when I threw it into a console application.
Everything always got an error with Combinations...
Do a for loop and count from 11111 to 99999:
for (int i = 11111; i <= 99999; i++)
{
    var combination = i.ToString();
    Console.WriteLine(combination);
}
Or, if you want 00000 to 99999:
for (int i = 0; i <= 99999; i++)
{
    var combination = String.Format("{0:D5}", i);
    Console.WriteLine(combination);
}
Simply counting from 0 to 99999 will produce all combinations (and you really should start with 00000 if you want all combinations)
If you're looking for a way to combine numbers, not specifically to get a sequence, you can do a linq query for it.
var bob = new [] {1,2,3,4,5,6,7,8,9,0};
var greg =
    from a in bob
    from b in bob
    from c in bob
    from d in bob
    from e in bob
    select string.Concat(a, b, c, d, e);
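The nested from clauses compute a Cartesian product of the digit set with itself five times. A sketch of the same idea in Python, using itertools.product from the standard library; the digit order is taken from the question, and the variable names are illustrative:

```python
from itertools import product

digits = "1234567890"

# product(..., repeat=5) yields every 5-tuple of digits; join each into a string.
codes = ["".join(combo) for combo in product(digits, repeat=5)]

print(len(codes))  # 100000
print(codes[0])    # 11111
print(codes[-1])   # 00000
```

With ten symbols and five positions there are 10**5 results, including those with leading zeros.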

Spelling Suggestor in ASP.NET

I need to build a spelling suggestor in ASP.NET. Below are my requirements.
Case 1: My list of words is not just English words but will also include some codes like AACD, ESSA, BIMER etc. I may provide such (new) words from a database.
Case 2: I also need a similar spelling suggestor for a non-English language. Even here, I can provide a list of words from a database.
Any suggestions as to how I can implement this are welcome.
Further, I found the following Python code on a website, which states that it returns the most probable suggestion (in English, of course). If someone can translate it into C#, that would be really helpful.
import re, collections
def words(text): return re.findall('[a-z]+', text.lower())
def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model
NWORDS = train(words(file('big.txt').read()))
alphabet = 'abcdefghijklmnopqrstuvwxyz'
def edits1(word):
    s = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in s if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in s if len(b)>1]
    replaces = [a + c + b[1:] for a, b in s for c in alphabet if b]
    inserts = [a + c + b for a, b in s for c in alphabet]
    return set(deletes + transposes + replaces + inserts)
def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)
def known(words): return set(w for w in words if w in NWORDS)
def correct(word):
    candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
    return max(candidates, key=NWORDS.get)
Thanks
- Raja
Another alternative is NHunspell:
NHunspell is a free open source spell checker for the .NET Framework. C# and Visual Basic sample code is available for spell checking, hyphenation and synonym lookup via thesaurus.
using (Hunspell hunspell = new Hunspell("en_us.aff", "en_us.dic"))
{
    bool correct = hunspell.Spell("Recommendation");
    var suggestions = hunspell.Suggest("Recommendatio");
    foreach (string suggestion in suggestions)
    {
        Console.WriteLine("Suggestion is: " + suggestion );
    }
}
The commercial product I work on uses NETSpell Spell Checker, it has a dictionary tool that allows you to add custom dictionaries and words.
Free .NET spell checker based around a WPF text box that can be used client or server side can be seen here. This can be passed a list of words to ignore (your custom dictionary)
Full disclosure...written by yours truly with some help from stack overflow of course :)
