Extracting a variable number of fields with jq

I'm new to this site and came here because I'm really struggling to extract information from a JSON file. The difficulty is that the number of fields can vary from record to record, so a simple path expression does not work for me.

Here is a sample of the data:

{
  "addresses": {
    "@count": "1",
    "address_name": {
      "address_spec": {
        "@addr_no": "1",
        "full_address": "Tel Aviv Univ, Eitan Berglas Sch Econ, IL-69978 Tel Aviv, Israel",
        "organizations": {
          "@count": "2",
          "organization": [
            "Tel Aviv Univ",
            {
              "@pref": "Y",
              "#text": "Tel Aviv University"
            }
          ]
        },
        "suborganizations": {
          "@count": "1",
          "suborganization": "Eitan Berglas Sch Econ"
        },
        "city": "Tel Aviv",
        "country": "Israel",
        "zip": {
          "@location": "BC",
          "#text": "IL-69978"
        }
      }
    }
  },
  "category_info": {
    "headings": {
      "@count": "1",
      "heading": "Social Sciences"
    },
    "subjects": {
      "@count": "3",
      "subject": [
        {
          "@ascatype": "traditional",
          "#text": "Economics"
        },
        {
          "@ascatype": "extended",
          "#text": "Business & Economics"
        },
        {
          "@ascatype": "traditional",
          "#text": "ECONOMICS"
        }
      ]
    }
  }
}
{
  "addresses": {
    "@count": "1",
    "address_name": {
      "address_spec": {
        "@addr_no": "1",
        "full_address": "MIT, Cambridge, MA 02139 USA",
        "organizations": {
          "@count": "2",
          "organization": [
            "MIT",
            {
              "@pref": "Y",
              "#text": "Massachusetts Institute of Technology (MIT)"
            }
          ]
        },
        "city": "Cambridge",
        "state": "MA",
        "country": "USA",
        "zip": {
          "@location": "AP",
          "#text": "02139"
        }
      }
    }
  },
  "category_info": {
    "headings": {
      "@count": "1",
      "heading": "Social Sciences"
    },
    "subjects": {
      "@count": "3",
      "subject": [
        {
          "@ascatype": "traditional",
          "#text": "Economics"
        },
        {
          "@ascatype": "extended",
          "#text": "Business & Economics"
        },
        {
          "@ascatype": "traditional",
          "#text": "ECONOMICS"
        }
      ]
    }
  }
}
{
  "addresses": {
    "@count": "2",
    "address_name": [
      {
        "address_spec": {
          "@addr_no": "1",
          "full_address": "Univ Kentucky, Lexington, KY 40506 USA",
          "organizations": {
            "@count": "2",
            "organization": [
              "Univ Kentucky",
              {
                "@pref": "Y",
                "#text": "University of Kentucky"
              }
            ]
          },
          "city": "Lexington",
          "state": "KY",
          "country": "USA",
          "zip": {
            "@location": "AP",
            "#text": "40506"
          }
        }
      },
      {
        "address_spec": {
          "@addr_no": "2",
          "full_address": "Univ Bonn, ZEI, D-5300 Bonn, Germany",
          "organizations": {
            "@count": "2",
            "organization": [
              "Univ Bonn",
              {
                "@pref": "Y",
                "#text": "University of Bonn"
              }
            ]
          },
          "suborganizations": {
            "@count": "1",
            "suborganization": "ZEI"
          },
          "city": "Bonn",
          "country": "Germany",
          "zip": {
            "@location": "BC",
            "#text": "D-5300"
          }
        }
      }
    ]
  },
  "category_info": {
    "headings": {
      "@count": "1",
      "heading": "Social Sciences"
    },
    "subjects": {
      "@count": "3",
      "subject": [
        {
          "@ascatype": "traditional",
          "#text": "Economics"
        },
        {
          "@ascatype": "extended",
          "#text": "Business & Economics"
        },
        {
          "@ascatype": "traditional",
          "#text": "ECONOMICS"
        }
      ]
    }
  }
}
{
  "addresses": {
    "@count": "1",
    "address_name": {
      "address_spec": {
        "@addr_no": "1",
        "full_address": "Harvard Univ, Cambridge, MA 02138 USA",
        "organizations": {
          "@count": "2",
          "organization": [
            "Harvard Univ",
            {
              "@pref": "Y",
              "#text": "Harvard University"
            }
          ]
        },
        "city": "Cambridge",
        "state": "MA",
        "country": "USA",
        "zip": {
          "@location": "AP",
          "#text": "02138"
        }
      }
    }
  },
  "category_info": {
    "headings": {
      "@count": "1",
      "heading": "Social Sciences"
    },
    "subjects": {
      "@count": "3",
      "subject": [
        {
          "@ascatype": "traditional",
          "#text": "Economics"
        },
        {
          "@ascatype": "extended",
          "#text": "Business & Economics"
        },
        {
          "@ascatype": "traditional",
          "#text": "ECONOMICS"
        }
      ]
    }
  }
}
{
  "addresses": {
    "@count": "3",
    "address_name": [
      {
        "address_spec": {
          "@addr_no": "1",
          "full_address": "Columbia Univ, New York, NY 10027 USA",
          "organizations": {
            "@count": "2",
            "organization": [
              "Columbia Univ",
              {
                "@pref": "Y",
                "#text": "Columbia University"
              }
            ]
          },
          "city": "New York",
          "state": "NY",
          "country": "USA",
          "zip": {
            "@location": "AP",
            "#text": "10027"
          }
        }
      },
      {
        "address_spec": {
          "@addr_no": "2",
          "full_address": "NYU, New York, NY USA",
          "organizations": {
            "@count": "2",
            "organization": [
              "NYU",
              {
                "@pref": "Y",
                "#text": "New York University"
              }
            ]
          },
          "city": "New York",
          "state": "NY",
          "country": "USA"
        }
      },
      {
        "address_spec": {
          "@addr_no": "3",
          "full_address": "Univ Pompeu Fabra, Barcelona, Spain",
          "organizations": {
            "@count": "2",
            "organization": [
              "Univ Pompeu Fabra",
              {
                "@pref": "Y",
                "#text": "Pompeu Fabra University"
              }
            ]
          },
          "city": "Barcelona",
          "country": "Spain"
        }
      }
    ]
  },
  "category_info": {
    "headings": {
      "@count": "1",
      "heading": "Social Sciences"
    },
    "subjects": {
      "@count": "3",
      "subject": [
        {
          "@ascatype": "traditional",
          "#text": "Economics"
        },
        {
          "@ascatype": "extended",
          "#text": "Business & Economics"
        },
        {
          "@ascatype": "traditional",
          "#text": "ECONOMICS"
        }
      ]
    }
  }
}
{
  "addresses": {
    "@count": "2",
    "address_name": [
      {
        "address_spec": {
          "@addr_no": "1",
          "full_address": "Univ Chicago, Chicago, IL 60637 USA",
          "organizations": {
            "@count": "2",
            "organization": [
              "Univ Chicago",
              {
                "@pref": "Y",
                "#text": "University of Chicago"
              }
            ]
          },
          "city": "Chicago",
          "state": "IL",
          "country": "USA",
          "zip": {
            "@location": "AP",
            "#text": "60637"
          }
        }
      },
      {
        "address_spec": {
          "@addr_no": "2",
          "full_address": "Amer Bar Fdn, Chicago, IL 60611 USA",
          "organizations": {
            "@count": "1",
            "organization": "Amer Bar Fdn"
          },
          "city": "Chicago",
          "state": "IL",
          "country": "USA",
          "zip": {
            "@location": "AP",
            "#text": "60611"
          }
        }
      }
    ]
  },
  "category_info": {
    "headings": {
      "@count": "1",
      "heading": "Social Sciences"
    },
    "subjects": {
      "@count": "3",
      "subject": [
        {
          "@ascatype": "traditional",
          "#text": "Economics"
        },
        {
          "@ascatype": "extended",
          "#text": "Business & Economics"
        },
        {
          "@ascatype": "traditional",
          "#text": "ECONOMICS"
        }
      ]
    }
  }
}
{
  "addresses": {
    "@count": "2",
    "address_name": [
      {
        "address_spec": {
          "@addr_no": "1",
          "full_address": "Ohio State Univ, Columbus, OH 43210 USA",
          "organizations": {
            "@count": "2",
            "organization": [
              "Ohio State Univ",
              {
                "@pref": "Y",
                "#text": "Ohio State University"
              }
            ]
          },
          "city": "Columbus",
          "state": "OH",
          "country": "USA",
          "zip": {
            "@location": "AP",
            "#text": "43210"
          }
        }
      },
      {
        "address_spec": {
          "@addr_no": "2",
          "full_address": "Harvard Univ, Cambridge, MA 02138 USA",
          "organizations": {
            "@count": "2",
            "organization": [
              "Harvard Univ",
              {
                "@pref": "Y",
                "#text": "Harvard University"
              }
            ]
          },
          "city": "Cambridge",
          "state": "MA",
          "country": "USA",
          "zip": {
            "@location": "AP",
            "#text": "02138"
          }
        }
      }
    ]
  },
  "category_info": {
    "headings": {
      "@count": "1",
      "heading": "Social Sciences"
    },
    "subjects": {
      "@count": "3",
      "subject": [
        {
          "@ascatype": "traditional",
          "#text": "Economics"
        },
        {
          "@ascatype": "extended",
          "#text": "Business & Economics"
        },
        {
          "@ascatype": "traditional",
          "#text": "ECONOMICS"
        }
      ]
    }
  }
}
{
  "addresses": {
    "@count": "1",
    "address_name": {
      "address_spec": {
        "@addr_no": "1",
        "full_address": "Univ Chicago, Chicago, IL 60637 USA",
        "organizations": {
          "@count": "2",
          "organization": [
            "Univ Chicago",
            {
              "@pref": "Y",
              "#text": "University of Chicago"
            }
          ]
        },
        "city": "Chicago",
        "state": "IL",
        "country": "USA",
        "zip": {
          "@location": "AP",
          "#text": "60637"
        }
      }
    }
  },
  "category_info": {
    "headings": {
      "@count": "1",
      "heading": "Social Sciences"
    },
    "subjects": {
      "@count": "3",
      "subject": [
        {
          "@ascatype": "traditional",
          "#text": "Economics"
        },
        {
          "@ascatype": "extended",
          "#text": "Business & Economics"
        },
        {
          "@ascatype": "traditional",
          "#text": "ECONOMICS"
        }
      ]
    }
  }
}
{
  "addresses": {
    "@count": "2",
    "address_name": [
      {
        "address_spec": {
          "@addr_no": "1",
          "full_address": "Wissensch Zentrum Berlin Sozialforsch, D-1000 Berlin, Germany",
          "organizations": {
            "@count": "1",
            "organization": "Wissensch Zentrum Berlin Sozialforsch"
          },
          "city": "Berlin",
          "country": "Germany",
          "zip": {
            "@location": "BC",
            "#text": "D-1000"
          }
        }
      },
      {
        "address_spec": {
          "@addr_no": "2",
          "full_address": "Harvard Univ, Dept Govt, Cambridge, MA 02138 USA",
          "organizations": {
            "@count": "2",
            "organization": [
              "Harvard Univ",
              {
                "@pref": "Y",
                "#text": "Harvard University"
              }
            ]
          },
          "suborganizations": {
            "@count": "1",
            "suborganization": "Dept Govt"
          },
          "city": "Cambridge",
          "state": "MA",
          "country": "USA",
          "zip": {
            "@location": "AP",
            "#text": "02138"
          }
        }
      }
    ]
  },
  "category_info": {
    "headings": {
      "@count": "1",
      "heading": "Social Sciences"
    },
    "subjects": {
      "@count": "3",
      "subject": [
        {
          "@ascatype": "traditional",
          "#text": "Economics"
        },
        {
          "@ascatype": "extended",
          "#text": "Business & Economics"
        },
        {
          "@ascatype": "traditional",
          "#text": "ECONOMICS"
        }
      ]
    }
  }
}
{
  "addresses": {
    "@count": "2",
    "address_name": [
      {
        "address_spec": {
          "@addr_no": "1",
          "full_address": "NYU, CV Starr Ctr Appl Econ, New York, NY 10003 USA",
          "organizations": {
            "@count": "2",
            "organization": [
              "NYU",
              {
                "@pref": "Y",
                "#text": "New York University"
              }
            ]
          },
          "suborganizations": {
            "@count": "1",
            "suborganization": "CV Starr Ctr Appl Econ"
          },
          "city": "New York",
          "state": "NY",
          "country": "USA",
          "zip": {
            "@location": "AP",
            "#text": "10003"
          }
        }
      },
      {
        "address_spec": {
          "@addr_no": "2",
          "full_address": "Princeton Univ, Princeton, NJ 08544 USA",
          "organizations": {
            "@count": "2",
            "organization": [
              "Princeton Univ",
              {
                "@pref": "Y",
                "#text": "Princeton University"
              }
            ]
          },
          "city": "Princeton",
          "state": "NJ",
          "country": "USA",
          "zip": {
            "@location": "AP",
            "#text": "08544"
          }
        }
      }
    ]
  },
  "category_info": {
    "headings": {
      "@count": "1",
      "heading": "Social Sciences"
    },
    "subjects": {
      "@count": "3",
      "subject": [
        {
          "@ascatype": "traditional",
          "#text": "Economics"
        },
        {
          "@ascatype": "extended",
          "#text": "Business & Economics"
        },
        {
          "@ascatype": "traditional",
          "#text": "ECONOMICS"
        }
      ]
    }
  }
}

What I was hoping to extract is the country for each record (some records have more than one country, which seems to be what causes the problem). So my naive approach was:

.static_data."fullrecord_metadata".addresses.address_name.country

However, this gives me several errors ("null has no keys" and "Cannot index array with string"). Checking with the keys command:

.static_data."fullrecord_metadata".addresses.address_name | keys

I can see that there is a problem with the data structure...

So, could you tell me whether I can extract the list of countries for each record with jq? Thank you!


Asked by econ, 19.12.2015


Answers (2)


For each top-level input JSON object, the following filter recursively checks every nested object to see whether it has a "country" key, and then reports the distinct "country" values for that top-level object:

jq -c '[.. | if type == "object" and has("country") 
             then .country
             else empty end] | unique' 
["Israel"]
["USA"]
["Germany","USA"]
["USA"]
["Spain","USA"]
["USA"]
["USA"]
["USA"]
["Germany","USA"]
["USA"]
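For readers who want to cross-check the filter outside jq, here is a rough Python sketch of the same idea: walk every nested value, keep the "country" of each object that has one, and report the sorted distinct values per top-level record. The helper names (iter_json_stream, walk, countries) and the trimmed two-record sample are assumptions made for illustration, and the sketch assumes every "country" value is a string, as in the sample data.

```python
import json


def iter_json_stream(text):
    """Yield each top-level JSON value from a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value


def walk(node):
    """Yield node and every value nested inside it, similar to jq's `..`."""
    yield node
    if isinstance(node, dict):
        for child in node.values():
            yield from walk(child)
    elif isinstance(node, list):
        for child in node:
            yield from walk(child)


def countries(record):
    """Sorted distinct values of every "country" key found anywhere in record."""
    found = [n["country"] for n in walk(record)
             if isinstance(n, dict) and "country" in n]
    # Assumes the values are strings (hashable and mutually comparable).
    return sorted(set(found))


# Hypothetical sample: two records cut down from the question's data.
sample = '''
{"addresses": {"address_name": {"address_spec": {"country": "Israel"}}}}
{"addresses": {"address_name": [
    {"address_spec": {"country": "USA"}},
    {"address_spec": {"country": "Germany"}}
]}}
'''

for record in iter_json_stream(sample):
    print(json.dumps(countries(record)))
```

Because the walk ignores where in the record the key sits, it does not matter whether address_name holds a single object or an array.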

Here is a filter that gives the same results on your sample, although it is not exactly equivalent:

[.. | .country? // empty] | unique

[Exercise for the interested reader: what is the difference? :-) ]

Answered by peak, 19.12.2015
Comment: Thanks a lot! :) This was too complex for me to figure out on my own. - econ; 19.12.2015
Comment: .country? // empty means that .country is used if it exists and is not null; otherwise it is replaced with empty. unique removes duplicate entries. - econ; 20.12.2015
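One way to see where the two filters diverge is to mirror both in Python and feed them a record whose "country" is null or false. This is only a sketch: the function names and the test record are made up for illustration, and the unique step is left out so the difference stays visible. It relies on jq's // operator treating both null and false as "no value".

```python
def walk(node):
    """Yield node and every value nested inside it, similar to jq's `..`."""
    yield node
    if isinstance(node, dict):
        for child in node.values():
            yield from walk(child)
    elif isinstance(node, list):
        for child in node:
            yield from walk(child)


def countries_has(record):
    """Mirror of: [.. | if type == "object" and has("country")
                        then .country else empty end]"""
    return [n["country"] for n in walk(record)
            if isinstance(n, dict) and "country" in n]


def countries_alt(record):
    """Mirror of: [.. | .country? // empty]"""
    out = []
    for n in walk(record):
        if isinstance(n, dict):
            value = n.get("country")  # a missing key behaves like null
            # `//` falls through to `empty` for null and for false.
            if value is not None and value is not False:
                out.append(value)
    return out


# Hypothetical record: one null, one real and one false "country".
record = {
    "a": {"country": None},
    "b": {"country": "USA"},
    "c": {"country": False},
}

print(countries_has(record))
print(countries_alt(record))
```

On the question's sample every "country" is a non-empty string, which is why both filters print the same lists there.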

Here is a solution that uses a function to handle the variation in .address_name:

 # .address_name is a single object in some records and an array in others
 def address_specs:
    if type == "array" then .[].address_spec else .address_spec end
 ;

 .addresses | .address_name | [address_specs | .country] | unique
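The same normalisation step can be sketched in Python: treat a lone object as a one-element list, then read address_spec from each element. The function names and the two small records are illustrative assumptions, and the sketch assumes every address_spec carries a string "country", as in the sample data.

```python
def address_specs(address_name):
    """Yield each address_spec, whether address_name is one object or a list."""
    items = address_name if isinstance(address_name, list) else [address_name]
    for item in items:
        yield item["address_spec"]


def countries(record):
    """Sorted distinct countries reachable through addresses.address_name."""
    specs = address_specs(record["addresses"]["address_name"])
    # Assumes each address_spec has a "country" key holding a string.
    return sorted({spec["country"] for spec in specs})


# Hypothetical records: one with a single address, one with an array.
single = {"addresses": {"address_name": {"address_spec": {"country": "Israel"}}}}
several = {"addresses": {"address_name": [
    {"address_spec": {"country": "USA"}},
    {"address_spec": {"country": "Spain"}},
    {"address_spec": {"country": "USA"}},
]}}

print(countries(single))
print(countries(several))
```

Unlike the recursive search, this version follows one known path, so it would not pick up a "country" key stored elsewhere in the record.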
Answered by jq170727, 26.08.2017