Regex to capture the section between two quotes

I can't seem to get my regex right when trying to capture the phrase between quotes. For example, I want the part in bold (NOTE: the input has text before and after it):

"I can quite understand your thinking so." **I said.** "Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"

"Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"**--I picked up the morning paper from the ground--**"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."

I tried to get the text before and after each quote, but I can't get the desired output. There must be some way to group the regex so that I capture the string between the quotes together with the two surrounding quotes.

Here is what I tried:

import re

def get_quotes(paragraph):
    quote_rx = r'''([""])(?:(?=(\\?))\2.)*?\1'''
    return [i.group(0) for i in \
           re.finditer(quote_rx, paragraph, re.S)]

def get_said(paragraph, quote):
    quote_start = paragraph.index(quote)
    quote_end = quote_start + len(quote)
    before = paragraph[:quote_start]
    after = paragraph[quote_end:]
    return before, after


paragraphs = ['''I smiled and shook my head. "I can quite understand your thinking so." I said. "Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"--I picked up the morning paper from the ground--"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."''', 
'''Such was the remarkable narrative to which I listened on that April evening -- a narrative which would have been utterly incredible to me had it not been confirmed by the actual sight of the tall, spare figure and the keen, eager face, which I had never thought to see again. In some manner he had learned of my own sad bereavement, and his sympathy was shown in his manner rather than in his words. "Work is the best antidote to sorrow, my dear Watson," said he, "and I have a piece of work for us both to-night which, if we can bring it to a successful conclusion, will in itself justify a man's life on this planet." In vain I begged him to tell me more. "You will hear and see enough before morning," he answered. "We have three years of the past to discuss. Let that suffice until half-past nine, when we start upon the notable adventure of the empty house."''']

for p in paragraphs:
    saids = set()
    for i in get_quotes(p):
        b,a = get_said(p,i)
        print(b)  # text before the quote
        print(a)  # text after the quote
        print()

Desired output:

in-btw: I said.
quotes: ["I can quite understand your thinking so.","Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"]
section: "I can quite understand your thinking so." **I said.** "Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"


in-btw: --I picked up the morning paper from the ground--
quotes: ['''"Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"''', '''"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."''']
section: "Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"**--I picked up the morning paper from the ground--**"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."

Asked by alvas, 21.10.2013
comment
([^"]*"[^"]*")+ should work (provided you start outside the quotes). [^"]* matches the text outside the quotes, "[^"]*" the text inside. - Danstahr, 21.10.2013
comment
+1 for showing us what you tried and the desired output. - Games Brainiac, 21.10.2013


Answers (1)


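This is fairly simple. Use the regex r'^("[^"]+")([^"]+)("[^"]+")': the first and third groups capture the two quoted strings (quotes included), and the middle group captures the text between them: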

import re

s = """
"I can quite understand your thinking so." I said. "Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"

"Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"--I picked up the morning paper from the ground--"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."
"""

for segment in s.splitlines():
    if not segment:
        continue
    first, said, second = re.match(r'^("[^"]+")([^"]+)("[^"]+")', segment).groups()
    print(first)
    print(said)
    print(second)

>>> 
"I can quite understand your thinking so."
 I said. 
"Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"
"Of course, in your position of unofficial adviser and helper to everybody who is absolutely puzzled, throughout three continents, you are brought in contact with all that is strange and bizarre. But here"
--I picked up the morning paper from the ground--
"let us put it to a practical test. Here is the first heading upon which I come. 'A husband's cruelty to his wife.' There is half a column of print, but I know without reading it that it is all perfectly familiar to me. There is, of course, the other woman, the drink, the push, the blow, the bruise, the sympathetic sister or landlady. The crudest of writers could invent nothing more crude."
Answered by Inbar Rose, 21.10.2013
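
The pattern in this answer returns only the first quote / in-between / quote triple on each line, while the desired output in the question lists every adjacent pair of quotes in a paragraph. Here is a minimal sketch of one way to get all pairs, assuming straight double quotes that are balanced. `quote_triples` and the sample `text` are hypothetical names and data introduced for illustration; they are not part of the original answer:

```python
import re

def quote_triples(paragraph):
    """Return (quote, in_between, next_quote) for every pair of adjacent quotes."""
    # Each match is one double-quoted span, quote characters included.
    matches = list(re.finditer(r'"[^"]+"', paragraph))
    triples = []
    # Pair each quote with the one that follows it.
    for left, right in zip(matches, matches[1:]):
        # Text strictly between the closing quote of `left`
        # and the opening quote of `right`.
        between = paragraph[left.end():right.start()]
        triples.append((left.group(0), between, right.group(0)))
    return triples

# Hypothetical sample with text before the first quote and after the last one.
text = 'I smiled. "First." I said. "Second"--I paused--"Third." The end.'
for first, said, second in quote_triples(text):
    print(repr(said))
# prints ' I said. ' and then '--I paused--'
```

Because only the `"` character is treated as a delimiter, single quotes and apostrophes inside a quotation (as in 'A husband's cruelty to his wife.') do not interfere. Unbalanced quotes would shift the pairing, so real input may need a sanity check first.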
comment
Thanks @Inbar, the regex works wonders. I wonder, did you try the regex on the data in the original post? I get 'NoneType' object has no attribute 'groups' - alvas, 21.10.2013
comment
Is that because ^("[^"]+") requires the quote to be at the start of the sentence? I have noise at the beginning of the sentence. - alvas, 21.10.2013
comment
Then remove the leading ^, which anchors the match to the start of the string, and use re.search instead of re.match. Next time, post your actual data in the exact form you want a solution for; otherwise you won't get an answer that suits you. - Inbar Rose, 21.10.2013
comment
Why can't I match the group with re.match after removing the ^? - alvas, 21.10.2013
comment
Because re.match only tries to match at the beginning of the string, whereas re.search scans through the string until it finds a position where the pattern can match... Read the documentation next time. - Inbar Rose, 21.10.2013
comment
Ahh... removing the ^ and using re.search works =) Thanks!! - alvas, 21.10.2013
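
For readers who hit the same 'NoneType' error, the difference discussed in the comments can be sketched as follows. The `line` value is a hypothetical input with leading "noise" before the first quote, and the `if found:` guard is an addition for illustration, not something from the thread:

```python
import re

# Same pattern as in the answer, minus the leading ^ anchor.
PATTERN = r'("[^"]+")([^"]+)("[^"]+")'

# Hypothetical input: narration comes before the first quoted string.
line = ('I smiled and shook my head. '
        '"I can quite understand your thinking so." I said. "Of course."')

# re.match only attempts a match at position 0, so the leading
# narration makes it fail even without the ^ anchor.
print(re.match(PATTERN, line))  # None

# re.search scans forward until it finds a position where the pattern matches.
found = re.search(PATTERN, line)
if found:  # guard against None instead of calling .groups() blindly
    first, said, second = found.groups()
    print(first)
    print(said.strip())
    print(second)
```

Checking the result for None before calling .groups() avoids the AttributeError on lines that contain fewer than two quoted strings.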