[www_shared] 08/19: follow-up, add rss

gnunet-svn

From:	gnunet
Subject:	[www_shared] 08/19: follow-up, add rss
Date:	Sat, 25 Jan 2020 11:29:23 +0100

This is an automated email from the git hooks/post-receive script.

ng0 pushed a commit to branch master
in repository www_shared.

commit 928b18e2f2abd2f0717de30a768428b8ac5fc799
Author: ng0 <address@hidden>
AuthorDate: Thu Nov 14 01:40:09 2019 +0000

    follow-up, add rss
---
 make_rss.py | 96 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 rss.xml.j2  | 28 ++++++++++++++++++
 site.py     | 20 +++++++++++--
 textproc.py | 27 +++++++++++++++++
 time.py     | 29 +++++++++++++++++++
 5 files changed, 198 insertions(+), 2 deletions(-)

diff --git a/make_rss.py b/make_rss.py
new file mode 100644
index 0000000..8dd6268
--- /dev/null
+++ b/make_rss.py
@@ -0,0 +1,96 @@
+# This file is part of www_shared.
+# (C) 2019 GNUnet e.V.
+#
+# Authors:
+# Author: ng0 <address@hidden>
+#
+# Permission to use, copy, modify, and/or distribute this software for any
+# purpose with or without fee is hereby granted.
+#
+# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+# WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+# MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE
+# LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES
+# OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
+# WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
+# ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
+# THIS SOFTWARE.
+#
+# SPDX-License-Identifier: 0BSD
+#
+# Process a number of .xml.j2 files with jinja2 and output
+# a .xml file for each which, according to the template, is a
+# spec-conformant RSS file. There could be more than one file,
+# so we do it this way.
+#
+# This generator in its current form is rather simplistic and assumes
+# too much structure in the YAML file, which should be improved
+# eventually.
+
+from pathlib import Path, PurePath
+import re
+import codecs
+from inc.time import time_rfc822, time_now, conv_date_rfc822
+
+debug=0
+
+def make_rss(directory, conf, env):
+    if debug > 1:
+        _ = Path(".")
+        q = list(_.glob("**/*.j2"))
+        print(q)
+    for infile in Path(directory).glob("*.xml.j2"):
+        infile = str(infile)
+        if debug > 1:
+            print(infile)
+        name, ext = re.match(r"(.*)\.([^.]+)$", infile.rstrip(".j2")).groups()
+        tmpl = env.get_template(infile)
+
+        def self_localized(other_locale):
+            """
+            Return absolute URL for the current page in another locale.
+            """
+            return "https://" + conf["siteconf"]["baseurl"] + "/" + other_locale + "/" + infile.replace(directory + '/', '').rstrip(".j2")
+
+        def url_localized(filename):
+            return "https://" + conf["siteconf"]["baseurl"] + "/" + locale + "/" + filename
+
+        def url_static(filename):
+            return "https://" + conf["siteconf"]["baseurl"] + "/static/" + filename
+
+        def url_dist(filename):
+            return "https://" + conf["siteconf"]["baseurl"] + "/dist/" + filename
+
+        def url(x):
+            return "https://" + conf["siteconf"]["baseurl"] + "/" + x
+        
+        for l in list(x for x in Path(".").glob("locale/*/") if x.is_dir()):
+            locale = str(PurePath(l).name)
+            if debug > 1:
+                print(locale)
+            content = tmpl.render(lang=locale,
+                                  url=url,
+                                  now=time_rfc822(time_now()),
+                                  conv_date_rfc822=conv_date_rfc822,
+                                  siteconf=conf["siteconf"],
+                                  newsposts=conf["newsposts"],
+                                  self_localized=self_localized,
+                                  url_localized=url_localized,
+                                  url_static=url_static,
+                                  url_dist=url_dist,
+                                  filename=name + "." + ext)
+            outname = "./rendered/" + locale + "/" + infile.replace(directory + "/", '').rstrip(".j2")
+            outdir = Path("rendered")
+            langdir = outdir / locale
+            try:
+                langdir.mkdir(parents=True, exist_ok=True)
+            except FileNotFoundError as e:
+                print(e)
+
+            with codecs.open(outname, "w", encoding='utf-8') as f:
+                try:
+                    if debug > 1:
+                        print(Path.cwd())
+                    f.write(content)
+                except OSError as e:
+                    print(e)
diff --git a/rss.xml.j2 b/rss.xml.j2
new file mode 100644
index 0000000..bdd9b69
--- /dev/null
+++ b/rss.xml.j2
@@ -0,0 +1,28 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
+  {% for siteconfitem in siteconf %}
+    <channel>
+      <atom:link href="https://{{ siteconfitem['baseurl'] }}/{{ lang }}{{ siteconfitem['newsloc'] }}rss.xml" rel="self" type="application/rss+xml" />
+      <title>{{ siteconfitem['rsstitle'] }}</title>
+      <language>{{ lang }}</language>
+      <description>{{ siteconfitem['rssdescr']|e }}</description>
+      <link>https://{{ siteconfitem['baseurl'] }}/</link>
+      <lastBuildDate>{{ now }}</lastBuildDate>
+    </channel>
+    {% for newspostitem in newsposts %}
+      <item>
+        <guid>https://{{ siteconfitem['baseurl'] }}/{{ lang }}{{ siteconfitem['newsloc'] }}{{ newspostitem['page'] }}</guid>
+        <link>https://{{ siteconfitem['baseurl'] }}/{{ lang }}{{ siteconfitem['newsloc'] }}{{ newspostitem['page'] }}</link>
+        <pubDate>{{ conv_date_rfc822(newspostitem["date"]) }}</pubDate>
+        <title>{{ newspostitem['title']|e }}</title>
+        <description>
+          <![CDATA[
+          <article>
+          {{ newspostitem['content'] }}
+          </article>
+          ]]>
+        </description>
+      </item>
+    {% endfor %}
+  {% endfor %}
+</rss>
diff --git a/site.py b/site.py
index 41e5e40..e12dfac 100644
--- a/site.py
+++ b/site.py
@@ -9,9 +9,9 @@ import jinja2
 from pathlib import Path, PurePosixPath, PurePath
 from ruamel.yaml import YAML
 import inc.i18nfix as i18nfix
-from inc.textproc import cut_news_text
+from inc.textproc import cut_news_text, cut_article
 from inc.fileproc import copy_files, copy_tree
-
+from inc.make_rss import *
 
 class gen_site:
     def __init__(self, debug):
@@ -40,6 +40,21 @@ class gen_site:
         if self.debug:
             print("[done] generating abstracts")
 
+    def gen_newspost_content(self, conf, name, member, pages, lang):
+        if self.debug:
+            print("generating newspost content...")
+        for item in conf[name]:
+            item[member] = cut_article(item[pages], conf, lang)
+        if self.debug:
+            print("cwd: " + str(Path.cwd()))
+        if self.debug > 1:
+            print(conf["newsposts"])
+        if self.debug:
+            print("[done] generating newspost content")
+
+    def gen_rss(self, directory, conf, env):
+        make_rss(directory, conf, env)
+
     def run(self, root, conf, env):
         # root = "../" + root
         if self.debug > 1:
@@ -116,6 +131,7 @@ class gen_site:
                 content = tmpl.render(lang=locale,
                                       lang_full=conf["langs_full"][locale],
                                       url=url,
+                                      siteconf=conf["siteconf"],
                                       meetingnotesdata=conf["meetingnotes"],
                                       newsdata=conf["newsposts"],
                                       videosdata=conf["videoslist"],
diff --git a/textproc.py b/textproc.py
index f3b97d3..e94cded 100644
--- a/textproc.py
+++ b/textproc.py
@@ -37,3 +37,30 @@ def cut_text(filename, count):
 
 def cut_news_text(filename, count):
     return cut_text("news/" + filename + ".j2", count)
+
+
+# TODO: replace id='...' with frontier so that we can
+# pass it in cut_article reusable, or merge cut_text and
+# cut_by_frontier.
+def cut_by_frontier(filename):
+    with open(filename) as html:
+        soup = BeautifulSoup(html, features="lxml")
+        k = []
+        for i in soup.find(id='newspost-content'):
+            k.append(i)
+        b = ''.join(str(e) for e in k)
+        text = b.replace("\n", "")
+        return text
+
+
+def cut_article(filename, conf, lang):
+    return cut_all("news/" + filename + ".j2", conf, lang)
+
+def cut_all(filename, conf, lang):
+    with open(filename) as html:
+        soup = BeautifulSoup(html, features="lxml")
+        i = repr(soup).replace('<html><body><p>{% extends "common/news.j2" %}\n{% block body_content %}\n  </p>', "").replace('\n{% endblock body_content %}\n</body></html>', "")
+        urlstr = "https://" + conf["siteconf"][0]["baseurl"] + "/" + lang + "/"
+        text = i.replace("\n", "").replace("{{ url_localized('", urlstr).replace("') }}", "")
+        # .replace('<', '&lt;').replace('>', '&gt;').replace('"', '&quot;')
+        return text
diff --git a/time.py b/time.py
new file mode 100644
index 0000000..aa578f9
--- /dev/null
+++ b/time.py
@@ -0,0 +1,29 @@
+import time
+import datetime
+import email.utils
+
+def time_now():
+    return datetime.datetime.now()
+
+def conv_date(t):
+    # naively assumes its input is always a Y-m-d.
+    if type(t) == str:
+        i = datetime.datetime.strptime(t, "%Y-%m-%d").timetuple()
+        return time.mktime(i)
+    elif type(t) == datetime.date:
+        i = t.timetuple()
+        return time.mktime(i)
+    else:
+        raise TypeError("conv_date: expected str or datetime.date, got %r" % type(t))
+
+def conv_date_rfc822(t):
+    return time_rfc822(conv_date(t))
+
+def time_rfc822(t):
+    if type(t) == float:
+        return email.utils.formatdate(t)
+    elif type(t) == datetime.datetime:
+        return email.utils.format_datetime(t)
+    else:
+        raise TypeError("time_rfc822: expected float or datetime.datetime, got %r" % type(t))
+
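
Note on time.py: the helpers above turn a Y-m-d string, a date, or a
datetime into the RFC 822 style date that <pubDate> and <lastBuildDate>
expect. The following is a minimal standalone sketch of the same idea,
not the code from this commit. The function name to_rfc822 is
hypothetical, and treating naive values as UTC is an assumption made
here so that the output does not depend on the local timezone (the
commit goes through time.mktime, which uses local time).

```python
import datetime
import email.utils


def to_rfc822(value):
    """Format a 'YYYY-MM-DD' string, a date, or a datetime as an RFC 822 date."""
    if isinstance(value, str):
        value = datetime.datetime.strptime(value, "%Y-%m-%d")
    elif isinstance(value, datetime.date) and not isinstance(value, datetime.datetime):
        # Plain dates carry no time of day; use midnight.
        value = datetime.datetime.combine(value, datetime.time())
    elif not isinstance(value, datetime.datetime):
        raise TypeError("unsupported date value: %r" % (value,))
    if value.tzinfo is None:
        # Assumption: naive values are interpreted as UTC.
        value = value.replace(tzinfo=datetime.timezone.utc)
    return email.utils.format_datetime(value)


print(to_rfc822("2019-11-14"))  # Thu, 14 Nov 2019 00:00:00 +0000
```

Using isinstance instead of comparing type() results also accepts
subclasses, at the cost of having to check datetime before date.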

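Note on make_rss.py: the template suffix is dropped with
rstrip(".j2"). str.rstrip removes any trailing run of the given
characters rather than a literal suffix. That happens to work for names
matching "*.xml.j2", because "l" is not in the character set, but it
would misbehave for other names. A small sketch of a suffix-only
alternative follows; strip_suffix is a hypothetical helper and is not
part of www_shared.

```python
def strip_suffix(name, suffix=".j2"):
    """Return name without a trailing suffix; leave other names unchanged."""
    if suffix and name.endswith(suffix):
        return name[:-len(suffix)]
    return name


print(strip_suffix("rss.xml.j2"))  # rss.xml
print("a2.j2".rstrip(".j2"))       # a   (characters stripped, not a suffix)
print(strip_suffix("a2.j2"))       # a2
```
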
-- 
To stop receiving notification emails like this one, please contact
address@hidden.
