R to Python List manipulation

trinker

ggplot2orBust
#1
I'm taking the plunge and learning Python at a deeper level to complement my R skills. I'm coming up against a wall manipulating named lists the way I can in R. R/Python folks, can you tell me the equivalent ways to do these common R tasks?

R way of working with a list:

Code:
x <- list(
    Titles = c(
        "[jJ]r",
        "[sS]r"
    ),
    Entities = c(
        "[bB]ros",
        "[iI]nc",
        "[lL]td",
        "[cC]o",
        "[cC]orp",
        "[pP]lc"
    )
)

## Get list names
names(x)
## [1] "Titles"   "Entities"

## Get index by number
x[[1]]
## [1] "[jJ]r" "[sS]r"

## Get elements within subvectors within a list
x[[2]][3]
## [1] "[lL]td"

## Flatten a named list into an unnamed vector
unlist(x, use.names = FALSE)
## [1] "[jJ]r"   "[sS]r"   "[bB]ros" "[iI]nc"  "[lL]td" 
## [6] "[cC]o"   "[cC]orp" "[pP]lc" 

## Make a lookup dictionary from the named list
mydict <- rep(names(x), lengths(x))
names(mydict) <- unlist(x, use.names = FALSE)

mydict["[jJ]r"]
##    [jJ]r 
## "Titles"
How can I do the same manipulations in Python? See my poor attempts below.

Python way of working with a list:

Code:
x = {
    "Titles": [
        "[jJ]r",
        "[sS]r"
    ],
    "Entities": [
        "[bB]ros",
        "[iI]nc",
        "[lL]td",
        "[cC]o",
        "[cC]orp",
        "[pP]lc"
    ]
}

## Get list names
x.names()
## Traceback (most recent call last):
## 
##   File "<ipython-input-51-2b90a2f5d75f>", line 1, in <module>
##    x.names()
## 
## AttributeError: 'dict' object has no attribute 'names'


## Get index by number
x[1]
## Traceback (most recent call last):
## 
##   File "<ipython-input-52-fb0e0080324e>", line 1, in <module>
##     x[1]
## 
## KeyError: 1

## Can do:
x["Titles"]
## but this doesn't help when a list with unknown names is passed

## Get elements within subvectors within a list
x[[2, 3]]
## Traceback (most recent call last):
## 
##   File "<ipython-input-54-94a22dca2f66>", line 1, in <module>
##     x[[2, 3]]
## 
## TypeError: unhashable type: 'list'


## Flatten a named list into an unnamed vector
[item for sublist in x for item in sublist]
## ['T', 'i', 't', 'l', 'e', 's', 'E', 'n', 't', 'i', 't', 'i', 'e', 's']
## Not what I'm going for


## Make a lookup dictionary from the named list
 

Jake

Cookie Scientist
#2
First of all, that's a dictionary, not a list. You construct dictionaries with curly braces and lists with square brackets.

A dictionary maps keys to values. A dictionary's .keys() method will return, well, its keys: in this case the list names 'Titles' and 'Entities'. The .values() method will return the two sublists, and the .items() method will return (key, value) pairs. In Python 3 these methods return view objects rather than plain lists, so wrap them in list() if you need an actual list, e.g. list(x.keys()). So that's how you can grab the contents when you don't know the keys in advance. Note also that you can use variables to refer to the keys, as in:
cool_key = 'Titles'
x[cool_key]

To grab elements of the sublists, use something like x["Titles"][0].

You can't index dictionaries by position, only by their keys.
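Putting the points above together, here is a minimal sketch of Python counterparts to the first three R calls in the question. It assumes Python 3.7 or later, where dictionaries preserve insertion order; the variable names names, first_value and third_entity are only illustrative.

```python
x = {
    "Titles": ["[jJ]r", "[sS]r"],
    "Entities": ["[bB]ros", "[iI]nc", "[lL]td", "[cC]o", "[cC]orp", "[pP]lc"],
}

# R: names(x)
names = list(x.keys())

# R: x[[1]]
# A dict has no positional indexing, so go through a list of its values.
# This relies on insertion order being preserved (Python 3.7+).
first_value = list(x.values())[0]

# R: x[[2]][3]
# Python is 0-based, so the third element of the sublist is index 2.
third_entity = x["Entities"][2]

# Walk the contents when the keys are not known in advance
for key, value in x.items():
    print(key, len(value))
```

Note the shift from R's 1-based indexing to Python's 0-based indexing in the last lookup.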

If you want something you can index both by position and by key/name, you could look into a namedtuple. But if I were you I'd reconsider whether this is really something you need. IMO it's probably safer and less error-prone to index either by position or by key/name, but not both.
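For reference, a namedtuple version might look roughly like this. The type name Patterns is made up for illustration, and the approach assumes the names are known up front and are valid Python identifiers, so it is less flexible than a dict.

```python
from collections import namedtuple

# "Patterns" is a hypothetical type name; the fields mirror the keys in the question
Patterns = namedtuple("Patterns", ["Titles", "Entities"])

p = Patterns(
    Titles=["[jJ]r", "[sS]r"],
    Entities=["[bB]ros", "[iI]nc", "[lL]td", "[cC]o", "[cC]orp", "[pP]lc"],
)

by_position = p[0]        # index by position, same object as p.Titles
by_name = p.Entities[2]   # index by name, then into the sublist
field_names = p._fields   # the names, as a tuple of strings
```

Unlike a dict, a namedtuple is immutable, so fields cannot be added or reassigned after it is created.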
 

Jake

Cookie Scientist
#4
If (and only if) you're working with a dictionary where you know that every value is a list, then you could cast the dict values (discarding the keys) into a list of lists using `list(x.values())` and then apply one of the recipes here for flattening a list: https://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python
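As a sketch of that idea applied to the dict from the question: the original attempt iterated over x itself, which yields the keys (strings), so the inner loop split them into characters. Iterating over x.values() instead gives the sublists. The same pattern over x.items() is one possible way to build the lookup dictionary asked about at the end of the question; the names flat and lookup are illustrative.

```python
x = {
    "Titles": ["[jJ]r", "[sS]r"],
    "Entities": ["[bB]ros", "[iI]nc", "[lL]td", "[cC]o", "[cC]orp", "[pP]lc"],
}

# R: unlist(x, use.names = FALSE)
# Iterate over the values (the sublists), not over the dict itself (the keys)
flat = [item for sublist in x.values() for item in sublist]

# R: the mydict lookup
# Map each pattern back to the name of the group it came from
lookup = {item: name for name, sublist in x.items() for item in sublist}

group = lookup["[jJ]r"]
```

If the same pattern appeared under two names, the later one would silently win in lookup, since dictionary keys are unique.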

If you don't know for sure that all the dict values are lists, then you'd have to add additional logic to check and/or cast the individual values to lists before trying to flatten it using one of those recipes.
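One way that extra logic could look, as a rough sketch: the helper name flatten_values and the mixed example data are hypothetical, and it only treats lists and tuples as sequences to unpack. Strings are deliberately kept whole, since iterating a string would split it into characters.

```python
def flatten_values(d):
    """Flatten dict values into one list, keeping non-sequence values whole.

    Hypothetical helper for illustration only.
    """
    flat = []
    for value in d.values():
        if isinstance(value, (list, tuple)):
            # Sequence value: add each of its elements
            flat.extend(value)
        else:
            # Strings and other scalars are appended as a single item
            flat.append(value)
    return flat


# Made-up example with a mix of list and non-list values
mixed = {"Titles": ["[jJ]r", "[sS]r"], "Suffix": "[eE]sq", "Count": 3}
result = flatten_values(mixed)
```

This handles one level of nesting only; deeper structures would need a recursive variant.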