This is where developers and integrators write about Plone, and is your best source for news and developments from the community.
July 11, 2014
July 10, 2014
by Maurizio Delmonte at 2014-07-10T15:51:25Z
by Clayton Parker at 2014-07-10T15:40:00Z
This is a recap of my presentation at the Plone Symposium Midwest 2014.
Managing Chaos: Merging 120 Sites into a single Plone Multisite Solution
Who Am I?
- Director of Engineering, Six Feet Up
- Organizer, IndyPy, Indianapolis Python Users Group
What will we Learn?
This talk covers:
- Six Feet Up's multisite solution with Plone and Lineage
- How we went about consolidating 120 Plone sites into one multisite solution in less than 90 days
- How this improved performance
Penn State has been a long-standing client of Six Feet Up. The College of Liberal Arts asked us to look into the performance of their 120 eLearning course sites. We saw this as a great opportunity for them to save time and money by consolidating everything instead of maintaining all the separate sites.
Old Site Creation Workflow
One of the main issues with the old implementation was that there were 120+ copies of all the objects needed to create a Plone site. That means 120 catalogs, Quickinstallers, properties, registries, etc. There was a lot of needless duplication in the scenario which hurt the performance of each site. Since they were all housed in one Data.fs, there was no easy way to avoid the loading of all these duplicate objects.
How is it made?
For the department and course types we utilized Lineage, an open source Plone product built by Six Feet Up. Lineage is a simple product that allows the course or department to appear as an autonomous Plone. It utilizes the NavigationRoot in Plone to make the navigation menu, search, portlets and anything else that does a search appear to be rooted at that level.
New Site Creation Workflow
Now, whenever a new course needs to be added, it's just like creating any other new content in Plone. In each department folder there is the option under "Add new..." to add a new Course folder.
These course folders have additional fields for the author, course number and banner images. Things that were previously manually filled out are now just a part of the content creation process.
In addition to having the fields on the type, events are used to create content and automatically add faculty and staff to the course.
Since we are still utilizing Plone and its folder structure, we can still use the built-in permission system. Global roles and groups can apply to the whole site, or local roles can be given to a user or group at a department and course level. This provides an easier way to manage users across the 120+ sites.
There are a few disadvantages that come along with a system housed in one Plone site. If there was a need to split out a course or department into a new site, this would require a migration.
Since everything is in one Plone site, any add-ons or properties are going to apply to the whole site. It would be more difficult to restrict the functionality of an add-on to one particular course or department.
On the flip side, having one set of add-ons to manage can be easier than 120 different configurations of installed add-ons. Upgrading or re-installing is more of a one-click process with fewer headaches.
Since the one Plone site houses all the content, it can be easily shared across departments or courses. No need for any external access, it can just be used directly.
Upgrading the Plone sites will be much easier moving forward. Instead of having to deal with 120+ migrations, there is just one.
The biggest advantage here was the performance boost that was gained. The system can handle the load of all those procrastinating students logging in on Sunday to finish their assignments much better now!
July 09, 2014
by Mikko Ohtamaa at 2014-07-09T11:56:25Z
This blog post presents rolling time window counting and rate limiting in Redis. You can apply it to activate a login CAPTCHA on your site only when it is needed. For the syntax-highlighted Python source code, please see the original blog post.
1. About Redis
Redis is a key-value store and persistent cache. Besides normal get/set functionality it offers more complex data structures like lists, hashes and sorted sets. If you are familiar with memcached, think of Redis as memcached on steroids.
Redis is often used for rate limiting. Usually the rate limit recipes count how many times something happens in a given second or a given minute. When the clock ticks over to the next minute, the rate limit counter is reset to zero. This can be problematic if you want to limit rates where the number of hits per time window is very low. If you want to limit to five hits per minute, you might get just one hit in one time window and six in the next; the second window trips the limit even though the average over the two minutes is 3.5.
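The reset problem can be seen with a few lines of plain Python. This is only an illustrative sketch with no Redis involved; the function names and the sample timestamps are made up for the example.

```python
from collections import Counter


def fixed_window_counts(timestamps, window=60):
    """Count hits per fixed window; each window starts again from zero."""
    return Counter(int(ts // window) for ts in timestamps)


def max_hits_in_any_rolling_window(timestamps, window=60):
    """Largest number of hits that fall inside any span of `window` seconds."""
    hits = sorted(timestamps)
    best = 0
    start = 0
    for end, ts in enumerate(hits):
        # Move the left edge forward until the span fits inside the window
        while ts - hits[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best


# Five hits at the end of minute 0, five more at the start of minute 1
hits = [55, 56, 57, 58, 59, 60, 61, 62, 63, 64]

per_minute = fixed_window_counts(hits)
rolling_peak = max_hits_in_any_rolling_window(hits)
```

With a limit of five per minute, neither fixed counter goes over the limit, although ten hits arrive within ten seconds. A rolling window sees all ten.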
This post presents a Python example of rolling time window based counting, so that the rate counter never resets to zero, but instead counts hits over the past X seconds. This is achieved using Redis sorted sets.
If you know any better way to do this with Redis – please let me know – I am no expert here. This is the first implementation I figured out.
""" Redis rolling time window counter and rate limit. Use Redis sorted sets to do a rolling time window counters and limiters. http://redis.io/commands/zadd """ import time def check(redis, key, window=60, limit=50): """ Do a rolling time window counter hit. :param redis: Redis client :param key: Redis key name we use to keep counter :param window: Rolling time window in seconds :param limit: Allowed operations per time window :return: True is the maximum limit has been reached for the current time window """ # Expire old keys (hits) expires = time.time() - window redis.zremrangebyscore(key, '-inf', expires) # Add a hit on the very moment now = time.time() redis.zadd(key, now, now) # If we currently have more keys than limit, # then limit the action if redis.zcard(key) > limit: return True return False def get(redis, key): """ Get the current hits per rolling time window. :param redis: Redis client :param key: Redis key name we use to keep counter :return: int, how many hits we have within the current rolling time window """ return redis.zcard(key)
3. Problematic CAPTCHAs
All of us hate CAPTCHAs. They are a double-edged sword. On one hand, you need to keep bots out of your site. On the other, CAPTCHAs are a turn-off for your site visitors and they drive away potential users.
4. CAPTCHAs and different login situations
There are three cases where you want the user to complete a CAPTCHA for login:
- Somebody is bruteforcing a single username (targeted attack): you need to count logins per username and not let the login proceed if this user is getting too many logins.
- Somebody is going through username/password combinations from a single IP: you count logins per IP.
- Somebody is going through username/password combinations and the attack comes from a very large IP pool. Usually these are botnet-driven attacks and the attacker can easily have tens of thousands of IP addresses to burn.
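The first two cases differ only in what the counter is keyed on. The following is a rough in-memory sketch of that idea; the key prefixes, the limits and the names `KeyedRollingCounter` and `captcha_required` are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class KeyedRollingCounter:
    """One rolling window of timestamps per key (username, IP, ...)."""

    def __init__(self, window=60):
        self.window = window
        self.hits = defaultdict(deque)

    def hit(self, key, now):
        """Record a hit for key and return its count inside the window."""
        q = self.hits[key]
        # Forget hits that have fallen out of the window
        while q and q[0] <= now - self.window:
            q.popleft()
        q.append(now)
        return len(q)


def captcha_required(counter, username, ip, now, per_user=5, per_ip=20):
    """True if either the username or the IP is over its (assumed) limit."""
    user_hits = counter.hit("login:user:" + username, now)
    ip_hits = counter.hit("login:ip:" + ip, now)
    return user_hits > per_user or ip_hits > per_ip


counter = KeyedRollingCounter(window=60)
# Six attempts on one username, each from a different address
decisions = [
    captcha_required(counter, "alice", "10.0.0.%d" % i, now=i)
    for i in range(6)
]
```

The per-username counter catches the sixth attempt even though every address is seen only once. Neither counter helps in the third case, where both the usernames and the addresses keep changing.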
The botnet-driven login attack is tricky to block. There might be only one login attempt from each IP. The only way to effectively stop the attack is to present a pre-login CAPTCHA, i.e. the user needs to solve the CAPTCHA even before the login can be attempted. However, a pre-login CAPTCHA is very annoying usability-wise: it prevents you from using the browser password manager for quick logins and sometimes gives you two minutes of extra headache before you get in to your favorite site.
Even services like CloudFlare do not help you here. Because there is only one request per IP, they cannot know beforehand whether the request is going to be legitimate or not (though they surely have some global heuristics and IP blacklists). You can flip on the “challenge” on your site, so that every visitor must complete a CAPTCHA before they can access your site, but this is a usability letdown again.
5. Mitigating botnet-driven login attack with on-situation CAPTCHA
You can have the best of both worlds: no login CAPTCHA in normal use, and still mitigate botnet-driven login attacks. This can be done by
- Monitoring your site login rate
- In a normal situation, not showing a pre-login CAPTCHA
- When there is clearly an abnormal login rate, which means there might be an attack going on, enabling the pre-login CAPTCHA for a certain time
Below is a pseudo-Python example of how this can be achieved using the rollingwindow Python module from above.
    from redis_cache import get_redis_connection

    import rollingwindow

    #: Redis sorted set key counting login attempts
    REDIS_LOGIN_ATTEMPTS_COUNTER = "login_attempts"

    #: Key telling that CAPTCHA become activated due to
    #: high login attempts rate
    REDIS_CAPTCHA_ACTIVATED = "captcha_activated"

    #: Captcha mode expires in 120 minutes (attack cooldown)
    CAPTCHA_TIMEOUT = 120 * 60

    #: Are you presented CAPTCHA when logging in first time
    #: Disabled in unit tests.
    LOGIN_ATTEMPTS_CHALLENGE_THRESHOLD = 500  # per minute


    def clear():
        """ Resets the challenge system state, per system or per IP. """
        redis = get_redis_connection("redis")
        redis.delete(REDIS_CAPTCHA_ACTIVATED)
        redis.delete(REDIS_LOGIN_ATTEMPTS_COUNTER)


    def get_login_rate():
        """
        :return: System global login rate per minute for metrics
        """
        redis = get_redis_connection("redis")
        return rollingwindow.get(redis, REDIS_LOGIN_ATTEMPTS_COUNTER)


    def check_captcha_needed(redis):
        """ Check if we need to enable login CAPTCHA globally.

        Increase login page load/submit counter.

        :return: True if our threshold for login page loads per minute is exceeded
        """

        # Count a hit towards login rate
        threshold_exceeded = rollingwindow.check(
            redis, REDIS_LOGIN_ATTEMPTS_COUNTER,
            limit=LOGIN_ATTEMPTS_CHALLENGE_THRESHOLD)

        # Are we in attack mode
        if not redis.get(REDIS_CAPTCHA_ACTIVATED):

            if not threshold_exceeded:
                # No login rate threshold exceeded,
                # and currently CAPTCHA not activated ->
                # allow login without CAPTCHA
                return False

            # Login attempt threshold exceeded,
            # we might be under attack,
            # activate CAPTCHA mode.
            # Argument order here is (name, value, time); newer redis-py
            # clients expect (name, time, value).
            redis.setex(REDIS_CAPTCHA_ACTIVATED, "true", CAPTCHA_TIMEOUT)

        return True


    def login(request):

        redis = get_redis_connection("redis")

        # Pass the Redis connection, not the request
        if check_captcha_needed(redis):
            # ... We need to CAPTCHA before this login can proceed ..
            pass
        else:
            # ... Allow login to proceed without CAPTCHA ...
            pass
by Jens W. Klein at 2014-07-09T10:55:00Z
Nowadays we use Bootstrap a lot as the base for our sites. But Bootstrap has a different grid system, classes, structure and so on. Transforming Plone's common viewlets, portlets and content into a Bootstrap theme is fine; transforming all the editing styles as well is a lot of effort.
We prefer to theme the site with in-place editing. This makes sense for bigger customers or Intranet/Extranet sites.
If - for a public site - budget is limited, it saves time and money to not have in-place editing in the themed site.
Therefore we often use Diazo for the public face of a site while using the default Sunburst theme for editing.
This kind of customer has 1-3 people doing all the editing. It's easy to teach them how it works. If login is still possible on the public-facing site, previewing private content is not a problem at all.
In this case we use two domains: www.customer.tld and cms.customer.tld.
As a result, some UI elements are needed only on the themed site. Showing them in editing mode would confuse editors, and styling them even minimally would take extra effort.
To overcome this we apply a BrowserLayer only if Diazo is active. Technically, this is done using a before-traversal event subscriber. Here is the subscriber.py:
    from plone.app.theming.utils import isThemeEnabled
    from zope.interface import alsoProvides
    from zope.interface import Interface


    class IDiazoMarkerLayer(Interface):
        """Layer Marker Interface applied if Diazo is active
        """


    def apply_diazo_layer(obj, event):
        # Nothing to do if the layer is already applied
        # or the Diazo theme is not active for this request
        if IDiazoMarkerLayer.providedBy(event.request) \
                or not isThemeEnabled(event.request):
            return
        alsoProvides(event.request, IDiazoMarkerLayer)

Register the subscriber in configure.zcml:
    ...
    <subscriber
        for="Products.CMFPlone.interfaces.IPloneSiteRoot
             zope.traversing.interfaces.IBeforeTraverseEvent"
        handler=".subscriber.apply_diazo_layer"
        />
    ...
Nothing more is needed!
The IDiazoMarkerLayer can be used like a usual BrowserLayer: register a view, viewlet or even a jbot template override in zcml with layer=".subscriber.IDiazoMarkerLayer" and it will only appear if the Diazo theme is active.
You can take this further and add other conditions; it really depends on your use case.
by Schlepp at 2014-07-09T03:39:43Z
Interestingly, Python holds its number 4 spot overall and for all individual language types. PHP follows in the number 6 slot. PERL struggles along at 8 or lower, depending on how you select the filter settings.
That said, it looks like Python is very much a contender in the programming language debate. I used to teach Python to my class in algorithms because the class text, Cormen, Leiserson, Rivest, and Stein's "Introduction to Algorithms," uses Python-ish pseudocode. In fact, we'd often cut and paste their pseudocode, make a few edits, and run the algorithms with time tracking functions. Racing algorithms--how to make O-notation fun.
That, of course, brings us back to CMS. A search on CMS Matrix for Python-based systems returns 23 results:
- BlackMonk CMS
- DXM Multilingual
- Easy Publisher
- eContent 3.5
- Macromedia Contribute
- Nuxeo CPS
- Web Cube
- WEB123 CMS
- WebEngine v6
That's a fascinating mismatch between the IEEE ranking of the programming language and number of CMSs based on the underlying language. Considering that Plone and Django are the heavy hitters among the Python-based CMSs, this puts them in a positive light.
July 08, 2014
by Giorgio Borelli at 2014-07-08T17:54:00Z
I just released Morepath 0.4.1. This fixes a regression with Python 3 compatibility and has a few other minor tweaks to bring test coverage back up to 100%.
I had broken Python 3 support in Morepath 0.4. I'm still not in the habit of running 'tox' before a release, so I find out about these problems too late.
I'll go into a bit of detail about this issue, as it's a mildly amusing example of writing Python code being more complicated than it should be.
Morepath 0.4 broke in Python 3 because I introduced a metaclass for the morepath.App class. I usually avoid metaclasses as they are a source of unpredictability and complexity, but the best solution I saw here was one. It's a very limited one.
One task of the metaclass is to attach to the class with Venusian. Venusian is a library that lets you write decorators that don't execute during import time but later. This is nice as import time side effects can be a source of trouble.
Venusian also lets you attach a callback to a Python object (such as a class) outside of a decorator. That's what I was doing; attaching to a class, in my metaclass.
Venusian determines in what context the decorator was called, such as module-level and class-level, so you can use that later. For this it inspects the Python stack frame of its caller.
My first attempt to make the metaclass work in Python 3 was to use the with_metaclass functionality from the future compatibility layer. I am using this library anyway in Reg, which is a dependency of Morepath, so using it would not introduce a new dependency for Morepath.
Unfortunately, after making that change my tests broke in both Python 2 and Python 3. That's not an improvement over having the tests broken in just Python 3!
It appears that with_metaclass introduces a new stack frame into the mix somewhere, which breaks Venusian's assumptions. Now Venusian's attach has a depth argument to determine where in the stack to check, so I increased the stack depth by one and ran the tests again. Fewer tests broke than before, but quite a few still did. I think the cause is that the stack depth of with_metaclass is just not consistent for whatever reason.
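The effect of an extra stack frame can be reproduced with the standard library alone. The sketch below is not Venusian code; the function names are hypothetical, and `sys._getframe` is a CPython implementation detail used here only to show how an intermediate layer shifts what a fixed depth points at.

```python
import sys


def calling_function(depth=1):
    """Return the name of the function `depth` frames above this one."""
    return sys._getframe(depth).f_code.co_name


def attach_directly():
    # The inspector is called straight from here: depth=1 finds us
    return calling_function(depth=1)


def helper(depth):
    # An intermediate layer, like a compatibility wrapper
    return calling_function(depth=depth)


def attach_through_helper(depth):
    return helper(depth)


direct = attach_directly()
shifted = attach_through_helper(depth=1)
compensated = attach_through_helper(depth=2)
```

With the helper in between, a depth of one lands on the helper instead of the intended caller, and a depth of two compensates. That only works when the number of intermediate frames is the same on every code path, which fits the observation that raising the depth by one fixed some tests but not all.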
Digging around in the future package I saw it includes a copy of six, another compatibility layer project. six has a name close to my heart -- long ago I originated the Five project for compatibility between Zope 2 and Zope 3.
That copy of six had another version of with_metaclass. I tried using future.util.six.with_metaclass, and hey, it all started working suddenly. All tests passed, in both Python 2 and Python 3. Yay!
Okay then, I figured, I don't want to depend on a copy of six that just happens to be lying about in future. It's not part of its public API as far as I understand. So I figured I should introduce a new dependency for Morepath after all, on six. It's not a big deal; Morepath's testing dependencies include WebTest, and this already has a dependency on six.
But when I pulled in six proper, I got a newer version of it than the one in future.util.six, and it caused the same test breakages as with future. Argh!
So I copied the code from old-six into Morepath's compat module. It's a two-liner anyway. It works for me. Morepath 0.4.1 done and released.
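For readers who have not seen one, a helper of this kind can be as short as the sketch below. It is modelled on the older, simpler style of `with_metaclass` and is an assumption about the general shape, not a copy of Morepath's compat module; `Meta`, `Base` and `Child` are hypothetical names.

```python
def with_metaclass(meta, *bases):
    """Create a temporary base class whose metaclass is `meta`."""
    return meta("NewBase", bases, {})


class Meta(type):
    """Example metaclass that records every class it creates."""

    created = []

    def __init__(cls, name, bases, namespace):
        super(Meta, cls).__init__(name, bases, namespace)
        Meta.created.append(name)


# The same class statement works in Python 2 and Python 3
class Base(with_metaclass(Meta, object)):
    pass


class Child(Base):
    pass
```

In this form the metaclass also runs once for the throwaway `NewBase` class, so a metaclass with side effects sees one extra class per use of the helper.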
But I don't know why six had to change its version, and why future's version is different. It worries me -- they probably have good reasons. Are those reasons going to break my code at some point in the future?
Being a responsible open source citizen, I left bug reports about my experiences in the six and future issue trackers.
I much prefer writing Python code. Polyglot is an inferior programming language as it introduces complexities like this. But Polyglot is what we got.
July 07, 2014
I've just released Morepath 0.4!
Morepath is a Python web framework that's small ("micro") and packs a lot of power. There are a lot of facilities for application reuse. And as opposed to most web frameworks, it actually has some intelligence about generating hyperlinks to objects.
Morepath 0.4 has a breaking change to the way application reuse works. Don't worry, you can fix your code by making a few minor changes. In short, Morepath application objects are now classes, not instances, and you can instantiate this class to get a WSGI object. See the CHANGES for a lot of details on what happened and what you need to do.
The big win is that application reuse in Morepath has become Python subclassing, and that making a WSGI application (even a parameterized one) is just instantiating the class.
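The general idea can be shown in plain Python without Morepath. The sketch below is not Morepath's API; `BaseApp`, `FriendlyApp` and `call_wsgi` are hypothetical names that only illustrate reuse through subclassing and getting a WSGI callable by instantiating the class.

```python
class BaseApp:
    """A tiny WSGI application class; instances are WSGI callables."""

    greeting = "Hello"

    def __init__(self, name="world"):
        # Parameters are supplied when the class is instantiated
        self.name = name

    def __call__(self, environ, start_response):
        body = ("%s, %s!" % (self.greeting, self.name)).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]


class FriendlyApp(BaseApp):
    """Reuse BaseApp by subclassing and overriding one piece."""

    greeting = "Welcome"


def call_wsgi(app):
    """Invoke a WSGI callable with a minimal environ and collect the body."""
    statuses = []
    chunks = app({"REQUEST_METHOD": "GET", "PATH_INFO": "/"},
                 lambda status, headers: statuses.append(status))
    return statuses[0], b"".join(chunks)


status, body = call_wsgi(FriendlyApp(name="Plone"))
```

The subclass changes behaviour without touching the base class, and the constructor argument plays the role of the application parameter.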
The other win is that Morepath gained even more extensibility features, namely the ability for Morepath extensions to introduce new Morepath directives (the decorators you see everywhere in Morepath examples). But I can't talk too much about that until I document them properly.
Along with the new Morepath, I've also made the initial release of BowerStatic (announcement). BowerStatic is a WSGI framework that lets you easily include bower-installed resources in your web page and do the right thing with caching (forever, thank you, but on a separate URL for each version).
How does that relate to Morepath, you may ask? Well, today I've also released the Morepath integration for BowerStatic, more.static. I've described in the Morepath documentation what to do to get it working in your Morepath project. The reason Morepath 0.4 had the breaking change was in part to support more.static, which needed the ability to introduce a new Morepath directive among other things.